* [PATCH RFC 0/5] DRM cgroup controller
@ 2018-11-20 18:58 Kenny Ho
  2018-11-20 18:58 ` [PATCH RFC 1/5] cgroup: Introduce cgroup for drm subsystem Kenny Ho
                   ` (4 more replies)
  0 siblings, 5 replies; 80+ messages in thread
From: Kenny Ho @ 2018-11-20 18:58 UTC (permalink / raw)
  To: y2kenny, Kenny.Ho, cgroups, dri-devel, amd-gfx, intel-gfx

The purpose of this patch series is to start a discussion on a generic cgroup
controller for the drm subsystem.  The design proposed here is a very early one,
and we are hoping to engage the community as we develop the idea.


Background
==========
Control Groups/cgroup provide a mechanism for aggregating/partitioning sets of
tasks, and all their future children, into hierarchical groups with specialized
behaviour, such as accounting for and limiting the resources that processes in a
cgroup can access [1].  Weights, limits, protections and allocations are the main
resource distribution models.  Existing cgroup controllers include cpu, memory,
io, rdma, and more.  cgroup is one of the foundational technologies that enable
the popular container application deployment and management method.

Direct Rendering Manager/drm contains code intended to support the needs of
complex graphics devices.  Graphics drivers in the kernel may make use of DRM
functions to make tasks like memory management, interrupt handling and DMA
easier, and to provide a uniform interface to applications.  DRM has also
developed beyond traditional graphics applications to support compute/GPGPU
applications.


Motivations
=========
As GPUs grow beyond the realm of desktop/workstation graphics into areas like
data center clusters and IoT, there is an increasing need to monitor and regulate
GPUs as a resource, much like cpu, memory and io.

Matt Roper from Intel began working on a similar idea in early 2018 [2] for the
purpose of managing GPU priority using the cgroup hierarchy.  While that
particular use case may not warrant a standalone drm cgroup controller, there
are other use cases where having one can be useful [3].  Monitoring GPU
resources such as VRAM and buffers, CUs (compute units, AMD's nomenclature)/EUs
(execution units, Intel's nomenclature), and GPU job scheduling [4] can help
sysadmins get a better understanding of an application's usage profile.  Further
regulation of the aforementioned resources can also help sysadmins optimize
workload deployment on limited GPU resources.

With the increased importance of machine learning, data science and other
cloud-based applications, GPUs are already in production use in data centers
today [5,6,7].  Existing GPU resource management is very coarse-grained, however,
as sysadmins are only able to distribute workloads on a per-GPU basis [8].  An
alternative is to use GPU virtualization (with or without SR-IOV), but it
generally acts on the entire GPU instead of on specific resources within a GPU.
With a drm cgroup controller, we can enable alternate, fine-grained, sub-GPU
resource management (in addition to what may be available via GPU
virtualization).

In addition to production use, the DRM cgroup can also help with testing
graphics application robustness by providing a means to artificially limit the
DRM resources available to the applications.

Challenges
========
While there is common infrastructure in DRM that is shared across many vendors
(the scheduler [4], for example), there are also aspects of DRM that are vendor
specific.  To accommodate this, we borrowed the mechanism that cgroup itself
uses to handle different kinds of cgroup controllers.

Resources for DRM are also often device (GPU) specific rather than system wide,
and a system may contain more than one GPU.  For this, we borrowed some of the
ideas from the RDMA cgroup controller.
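
To make the vendor-specific split concrete, the sketch below condenses the core
data structures introduced in patch 2: a set of per-vendor callbacks, plus a
per-cgroup array of device resources indexed by drm minor.  This is only a
summary of what the patches define, not additional API.

/* per-vendor hooks registered with the drm cgroup controller */
struct drmcgrp_vendor {
	struct cftype *(*get_cftypes)(void);	/* vendor control files */
	struct drmcgrp_device_resource *(*alloc_dev_resource)(void);
	void (*free_dev_resource)(struct drmcgrp_device_resource *dev_resource);
};

/* embedded at the start of each vendor-specific resource struct */
struct drmcgrp_device_resource {
	struct drmcgrp_device	*ddev;
};

/* one per cgroup; one resource slot per registered DRM device (minor) */
struct drmcgrp {
	struct cgroup_subsys_state	css;
	struct drmcgrp_device_resource	*dev_resources[MAX_DRM_DEV];
};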

Approach
=======
To experiment with the idea of a DRM cgroup, we would like to start with basic
accounting and statistics, then continue to iterate and add regulating
mechanisms into the driver.
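
The accounting hooks themselves are intentionally small.  For example, the
command-submission accounting in patch 4 hangs off a couple of lines in
amdgpu_cs_ioctl() (shown here out of context), and patch 5 adds a similar
one-line hook in amdgpu_gem_object_create():

	ring = to_amdgpu_ring(parser.entity->rq->sched);
	amdgpu_drmcgrp_count_cs(current, dev, ring->funcs->type);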

[1] https://www.kernel.org/doc/Documentation/cgroup-v1/cgroups.txt
[2] https://lists.freedesktop.org/archives/intel-gfx/2018-January/153156.html
[3] https://www.spinics.net/lists/cgroups/msg20720.html
[4] https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/scheduler
[5] https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/
[6] https://blog.openshift.com/gpu-accelerated-sql-queries-with-postgresql-pg-strom-in-openshift-3-10/
[7] https://github.com/RadeonOpenCompute/k8s-device-plugin
[8] https://github.com/kubernetes/kubernetes/issues/52757


Kenny Ho (5):
  cgroup: Introduce cgroup for drm subsystem
  cgroup: Add mechanism to register vendor specific DRM devices
  drm/amdgpu: Add DRM cgroup support for AMD devices
  drm/amdgpu: Add accounting of command submission via DRM cgroup
  drm/amdgpu: Add accounting of buffer object creation request via DRM
    cgroup

 drivers/gpu/drm/amd/amdgpu/Makefile         |   3 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c      |   5 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  |   7 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c | 147 ++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h |  27 ++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c     |  13 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c    |  15 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h    |   5 +-
 include/drm/drm_cgroup.h                    |  39 ++++++
 include/drm/drmcgrp_vendors.h               |   8 ++
 include/linux/cgroup_drm.h                  |  58 ++++++++
 include/linux/cgroup_subsys.h               |   4 +
 include/uapi/drm/amdgpu_drm.h               |  24 +++-
 init/Kconfig                                |   5 +
 kernel/cgroup/Makefile                      |   1 +
 kernel/cgroup/drm.c                         | 130 +++++++++++++++++
 16 files changed, 484 insertions(+), 7 deletions(-)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h
 create mode 100644 include/drm/drm_cgroup.h
 create mode 100644 include/drm/drmcgrp_vendors.h
 create mode 100644 include/linux/cgroup_drm.h
 create mode 100644 kernel/cgroup/drm.c

-- 
2.19.1

* [PATCH RFC 1/5] cgroup: Introduce cgroup for drm subsystem
  2018-11-20 18:58 [PATCH RFC 0/5] DRM cgroup controller Kenny Ho
@ 2018-11-20 18:58 ` Kenny Ho
       [not found] ` <20181120185814.13362-1-Kenny.Ho-5C7GfCeVMHo@public.gmane.org>
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 80+ messages in thread
From: Kenny Ho @ 2018-11-20 18:58 UTC (permalink / raw)
  To: y2kenny, Kenny.Ho, cgroups, dri-devel, amd-gfx, intel-gfx

Change-Id: I6830d3990f63f0c13abeba29b1d330cf28882831
Signed-off-by: Kenny Ho <Kenny.Ho@amd.com>
---
 include/linux/cgroup_drm.h    | 32 ++++++++++++++++++++++++
 include/linux/cgroup_subsys.h |  4 +++
 init/Kconfig                  |  5 ++++
 kernel/cgroup/Makefile        |  1 +
 kernel/cgroup/drm.c           | 46 +++++++++++++++++++++++++++++++++++
 5 files changed, 88 insertions(+)
 create mode 100644 include/linux/cgroup_drm.h
 create mode 100644 kernel/cgroup/drm.c

diff --git a/include/linux/cgroup_drm.h b/include/linux/cgroup_drm.h
new file mode 100644
index 000000000000..79ab38b0f46d
--- /dev/null
+++ b/include/linux/cgroup_drm.h
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: MIT
+ * Copyright 2018 Advanced Micro Devices, Inc.
+ */
+#ifndef _CGROUP_DRM_H
+#define _CGROUP_DRM_H
+
+#ifdef CONFIG_CGROUP_DRM
+
+#include <linux/cgroup.h>
+
+struct drmcgrp {
+	struct cgroup_subsys_state	css;
+};
+
+static inline struct drmcgrp *css_drmcgrp(struct cgroup_subsys_state *css)
+{
+	return css ? container_of(css, struct drmcgrp, css) : NULL;
+}
+
+static inline struct drmcgrp *get_drmcgrp(struct task_struct *task)
+{
+	return css_drmcgrp(task_get_css(task, drm_cgrp_id));
+}
+
+
+static inline struct drmcgrp *parent_drmcgrp(struct drmcgrp *cg)
+{
+	return css_drmcgrp(cg->css.parent);
+}
+
+#endif	/* CONFIG_CGROUP_DRM */
+#endif	/* _CGROUP_DRM_H */
diff --git a/include/linux/cgroup_subsys.h b/include/linux/cgroup_subsys.h
index acb77dcff3b4..ddedad809e8b 100644
--- a/include/linux/cgroup_subsys.h
+++ b/include/linux/cgroup_subsys.h
@@ -61,6 +61,10 @@ SUBSYS(pids)
 SUBSYS(rdma)
 #endif
 
+#if IS_ENABLED(CONFIG_CGROUP_DRM)
+SUBSYS(drm)
+#endif
+
 /*
  * The following subsystems are not supported on the default hierarchy.
  */
diff --git a/init/Kconfig b/init/Kconfig
index a4112e95724a..bee1e164443a 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -836,6 +836,11 @@ config CGROUP_RDMA
 	  Attaching processes with active RDMA resources to the cgroup
 	  hierarchy is allowed even if can cross the hierarchy's limit.
 
+config CGROUP_DRM
+	bool "DRM controller (EXPERIMENTAL)"
+	help
+	  Provides accounting and enforcement of resources in the DRM subsystem.
+
 config CGROUP_FREEZER
 	bool "Freezer controller"
 	help
diff --git a/kernel/cgroup/Makefile b/kernel/cgroup/Makefile
index bfcdae896122..6af14bd93050 100644
--- a/kernel/cgroup/Makefile
+++ b/kernel/cgroup/Makefile
@@ -4,5 +4,6 @@ obj-y := cgroup.o rstat.o namespace.o cgroup-v1.o
 obj-$(CONFIG_CGROUP_FREEZER) += freezer.o
 obj-$(CONFIG_CGROUP_PIDS) += pids.o
 obj-$(CONFIG_CGROUP_RDMA) += rdma.o
+obj-$(CONFIG_CGROUP_DRM) += drm.o
 obj-$(CONFIG_CPUSETS) += cpuset.o
 obj-$(CONFIG_CGROUP_DEBUG) += debug.o
diff --git a/kernel/cgroup/drm.c b/kernel/cgroup/drm.c
new file mode 100644
index 000000000000..d9e194b9aead
--- /dev/null
+++ b/kernel/cgroup/drm.c
@@ -0,0 +1,46 @@
+// SPDX-License-Identifier: MIT
+// Copyright 2018 Advanced Micro Devices, Inc.
+#include <linux/slab.h>
+#include <linux/cgroup.h>
+#include <linux/cgroup_drm.h>
+
+static u64 drmcgrp_test_read(struct cgroup_subsys_state *css,
+					struct cftype *cft)
+{
+	return 88;
+}
+
+static void drmcgrp_css_free(struct cgroup_subsys_state *css)
+{
+	struct drmcgrp *drmcgrp = css_drmcgrp(css);
+
+	kfree(css_drmcgrp(css));
+}
+
+static struct cgroup_subsys_state *
+drmcgrp_css_alloc(struct cgroup_subsys_state *parent_css)
+{
+	struct drmcgrp *drmcgrp;
+
+	drmcgrp = kzalloc(sizeof(struct drmcgrp), GFP_KERNEL);
+	if (!drmcgrp)
+		return ERR_PTR(-ENOMEM);
+
+	return &drmcgrp->css;
+}
+
+struct cftype files[] = {
+	{
+		.name = "drm_test",
+		.read_u64 = drmcgrp_test_read,
+	},
+	{ }	/* terminate */
+};
+
+struct cgroup_subsys drm_cgrp_subsys = {
+	.css_alloc	= drmcgrp_css_alloc,
+	.css_free	= drmcgrp_css_free,
+	.early_init	= false,
+	.legacy_cftypes	= files,
+	.dfl_cftypes	= files,
+};
-- 
2.19.1

* [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices
       [not found] ` <20181120185814.13362-1-Kenny.Ho-5C7GfCeVMHo@public.gmane.org>
@ 2018-11-20 18:58   ` Kenny Ho
       [not found]     ` <20181120185814.13362-3-Kenny.Ho-5C7GfCeVMHo@public.gmane.org>
  2018-11-20 18:58   ` [PATCH RFC 3/5] drm/amdgpu: Add DRM cgroup support for AMD devices Kenny Ho
  2018-11-20 18:58   ` [PATCH RFC 4/5] drm/amdgpu: Add accounting of command submission via DRM cgroup Kenny Ho
  2 siblings, 1 reply; 80+ messages in thread
From: Kenny Ho @ 2018-11-20 18:58 UTC (permalink / raw)
  To: y2kenny-Re5JQEeQqe8AvxtiuMwx3w, Kenny.Ho-5C7GfCeVMHo,
	cgroups-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Since many parts of the DRM subsystem have vendor-specific
implementations, we introduce mechanisms for vendors to register their
specific resources and control files with the DRM cgroup subsystem.  A
vendor registers itself with the DRM cgroup subsystem first before
registering individual DRM devices to the cgroup subsystem.

In addition to the cgroup_subsys_state that is common to all DRM
devices, a device-specific state is introduced and it is allocated
according to the vendor of the device.
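
For illustration, a driver is expected to use this interface roughly as
follows (this mirrors the amdgpu_device_init() hunk in patch 3; error
handling is elided):

	/* one-time vendor registration, then per-device registration */
	if (drmcgrp_vendors[amd_drmcgrp_vendor_id] == NULL)
		drmcgrp_register_vendor(&amd_drmcgrp_vendor, amd_drmcgrp_vendor_id);

	drmcgrp_register_device(adev->ddev, amd_drmcgrp_vendor_id);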

Change-Id: I908ee6975ea0585e4c30eafde4599f87094d8c65
Signed-off-by: Kenny Ho <Kenny.Ho@amd.com>
---
 include/drm/drm_cgroup.h      | 39 ++++++++++++++++
 include/drm/drmcgrp_vendors.h |  7 +++
 include/linux/cgroup_drm.h    | 26 +++++++++++
 kernel/cgroup/drm.c           | 84 +++++++++++++++++++++++++++++++++++
 4 files changed, 156 insertions(+)
 create mode 100644 include/drm/drm_cgroup.h
 create mode 100644 include/drm/drmcgrp_vendors.h

diff --git a/include/drm/drm_cgroup.h b/include/drm/drm_cgroup.h
new file mode 100644
index 000000000000..26cbea7059a6
--- /dev/null
+++ b/include/drm/drm_cgroup.h
@@ -0,0 +1,39 @@
+/* SPDX-License-Identifier: MIT
+ * Copyright 2018 Advanced Micro Devices, Inc.
+ */
+#ifndef __DRM_CGROUP_H__
+#define __DRM_CGROUP_H__
+
+#define DRMCGRP_VENDOR(_x) _x ## _drmcgrp_vendor_id,
+enum drmcgrp_vendor_id {
+#include <drm/drmcgrp_vendors.h>
+	DRMCGRP_VENDOR_COUNT,
+};
+#undef DRMCGRP_VENDOR
+
+#define DRMCGRP_VENDOR(_x) extern struct drmcgrp_vendor _x ## _drmcgrp_vendor;
+#include <drm/drmcgrp_vendors.h>
+#undef DRMCGRP_VENDOR
+
+
+
+#ifdef CONFIG_CGROUP_DRM
+
+extern struct drmcgrp_vendor *drmcgrp_vendors[];
+
+int drmcgrp_register_vendor(struct drmcgrp_vendor *vendor, enum drmcgrp_vendor_id id);
+int drmcgrp_register_device(struct drm_device *device, enum drmcgrp_vendor_id id);
+
+#else
+static int drmcgrp_register_vendor(struct drmcgrp_vendor *vendor, enum drmcgrp_vendor_id id)
+{
+	return 0;
+}
+
+static int drmcgrp_register_device(struct drm_device *device, enum drmcgrp_vendor_id id)
+{
+	return 0;
+}
+
+#endif /* CONFIG_CGROUP_DRM */
+#endif /* __DRM_CGROUP_H__ */
diff --git a/include/drm/drmcgrp_vendors.h b/include/drm/drmcgrp_vendors.h
new file mode 100644
index 000000000000..b04d8649851b
--- /dev/null
+++ b/include/drm/drmcgrp_vendors.h
@@ -0,0 +1,7 @@
+/* SPDX-License-Identifier: MIT
+ * Copyright 2018 Advanced Micro Devices, Inc.
+ */
+#if IS_ENABLED(CONFIG_CGROUP_DRM)
+
+
+#endif
diff --git a/include/linux/cgroup_drm.h b/include/linux/cgroup_drm.h
index 79ab38b0f46d..a776662d9593 100644
--- a/include/linux/cgroup_drm.h
+++ b/include/linux/cgroup_drm.h
@@ -6,10 +6,36 @@
 
 #ifdef CONFIG_CGROUP_DRM
 
+#include <linux/mutex.h>
 #include <linux/cgroup.h>
+#include <drm/drm_file.h>
+#include <drm/drm_cgroup.h>
+
+/* limit defined per the way drm_minor_alloc operates */
+#define MAX_DRM_DEV (64 * DRM_MINOR_RENDER)
+
+struct drmcgrp_device {
+	enum drmcgrp_vendor_id	vid;
+	struct drm_device	*dev;
+	struct mutex		mutex;
+};
+
+/* vendor-common resource counting goes here */
+/* this struct should be included in the vendor specific resource */
+struct drmcgrp_device_resource {
+	struct drmcgrp_device	*ddev;
+};
+
+struct drmcgrp_vendor {
+	struct cftype *(*get_cftypes)(void);
+	struct drmcgrp_device_resource *(*alloc_dev_resource)(void);
+	void (*free_dev_resource)(struct drmcgrp_device_resource *dev_resource);
+};
+
 
 struct drmcgrp {
 	struct cgroup_subsys_state	css;
+	struct drmcgrp_device_resource	*dev_resources[MAX_DRM_DEV];
 };
 
 static inline struct drmcgrp *css_drmcgrp(struct cgroup_subsys_state *css)
diff --git a/kernel/cgroup/drm.c b/kernel/cgroup/drm.c
index d9e194b9aead..f9630cc389bc 100644
--- a/kernel/cgroup/drm.c
+++ b/kernel/cgroup/drm.c
@@ -1,8 +1,30 @@
 // SPDX-License-Identifier: MIT
 // Copyright 2018 Advanced Micro Devices, Inc.
+#include <linux/export.h>
 #include <linux/slab.h>
 #include <linux/cgroup.h>
+#include <linux/fs.h>
+#include <linux/seq_file.h>
+#include <linux/mutex.h>
 #include <linux/cgroup_drm.h>
+#include <drm/drm_device.h>
+#include <drm/drm_cgroup.h>
+
+/* generate an array of drm cgroup vendor pointers */
+#define DRMCGRP_VENDOR(_x)[_x ## _drmcgrp_vendor_id] = NULL,
+struct drmcgrp_vendor *drmcgrp_vendors[] = {
+#include <drm/drmcgrp_vendors.h>
+};
+#undef DRMCGRP_VENDOR
+EXPORT_SYMBOL(drmcgrp_vendors);
+
+static DEFINE_MUTEX(drmcgrp_mutex);
+
+/* indexed by drm_minor for access speed */
+static struct drmcgrp_device	*known_drmcgrp_devs[MAX_DRM_DEV];
+
+static int max_minor;
+
 
 static u64 drmcgrp_test_read(struct cgroup_subsys_state *css,
 					struct cftype *cft)
@@ -13,6 +35,12 @@ static u64 drmcgrp_test_read(struct cgroup_subsys_state *css,
 static void drmcgrp_css_free(struct cgroup_subsys_state *css)
 {
 	struct drmcgrp *drmcgrp = css_drmcgrp(css);
+	int i;
+
+	for (i = 0; i <= max_minor; i++) {
+		if (drmcgrp->dev_resources[i] != NULL)
+			drmcgrp_vendors[known_drmcgrp_devs[i]->vid]->free_dev_resource(drmcgrp->dev_resources[i]);
+	}
 
 	kfree(css_drmcgrp(css));
 }
@@ -21,11 +49,27 @@ static struct cgroup_subsys_state *
 drmcgrp_css_alloc(struct cgroup_subsys_state *parent_css)
 {
 	struct drmcgrp *drmcgrp;
+	int i;
 
 	drmcgrp = kzalloc(sizeof(struct drmcgrp), GFP_KERNEL);
 	if (!drmcgrp)
 		return ERR_PTR(-ENOMEM);
 
+	for (i = 0; i <= max_minor; i++) {
+		if (known_drmcgrp_devs[i] != NULL) {
+			struct drmcgrp_device_resource *ddr =
+				drmcgrp_vendors[known_drmcgrp_devs[i]->vid]->alloc_dev_resource();
+
+			if (IS_ERR(ddr)) {
+				drmcgrp_css_free(&drmcgrp->css);
+				return ERR_PTR(-ENOMEM);
+			}
+
+			drmcgrp->dev_resources[i] = ddr;
+			drmcgrp->dev_resources[i]->ddev = known_drmcgrp_devs[i];
+		}
+	}
+
 	return &drmcgrp->css;
 }
 
@@ -44,3 +88,43 @@ struct cgroup_subsys drm_cgrp_subsys = {
 	.legacy_cftypes	= files,
 	.dfl_cftypes	= files,
 };
+
+int drmcgrp_register_vendor(struct drmcgrp_vendor *vendor, enum drmcgrp_vendor_id id)
+{
+	int rc = 0;
+	struct cftype *cfts;
+
+	// TODO: root css created before any registration
+	if (drmcgrp_vendors[id] == NULL) {
+		drmcgrp_vendors[id] = vendor;
+		cfts = vendor->get_cftypes();
+		if (cfts != NULL)
+			rc = cgroup_add_legacy_cftypes(&drm_cgrp_subsys, cfts);
+	}
+	return rc;
+}
+EXPORT_SYMBOL(drmcgrp_register_vendor);
+
+
+int drmcgrp_register_device(struct drm_device *dev, enum drmcgrp_vendor_id id)
+{
+	struct drmcgrp_device *ddev;
+
+	ddev = kzalloc(sizeof(struct drmcgrp_device), GFP_KERNEL);
+	if (!ddev)
+		return -ENOMEM;
+
+	mutex_lock(&drmcgrp_mutex);
+
+	ddev->vid = id;
+	ddev->dev = dev;
+	mutex_init(&ddev->mutex);
+
+	known_drmcgrp_devs[dev->primary->index] = ddev;
+
+	max_minor = max(max_minor, dev->primary->index);
+
+	mutex_unlock(&drmcgrp_mutex);
+	return 0;
+}
+EXPORT_SYMBOL(drmcgrp_register_device);
-- 
2.19.1

* [PATCH RFC 3/5] drm/amdgpu: Add DRM cgroup support for AMD devices
       [not found] ` <20181120185814.13362-1-Kenny.Ho-5C7GfCeVMHo@public.gmane.org>
  2018-11-20 18:58   ` [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices Kenny Ho
@ 2018-11-20 18:58   ` Kenny Ho
       [not found]     ` <20181120185814.13362-4-Kenny.Ho-5C7GfCeVMHo@public.gmane.org>
  2018-11-20 18:58   ` [PATCH RFC 4/5] drm/amdgpu: Add accounting of command submission via DRM cgroup Kenny Ho
  2 siblings, 1 reply; 80+ messages in thread
From: Kenny Ho @ 2018-11-20 18:58 UTC (permalink / raw)
  To: y2kenny-Re5JQEeQqe8AvxtiuMwx3w, Kenny.Ho-5C7GfCeVMHo,
	cgroups-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Change-Id: Ib66c44ac1b1c367659e362a2fc05b6fbb3805876
Signed-off-by: Kenny Ho <Kenny.Ho@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/Makefile         |  3 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  |  7 ++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c | 37 +++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h | 19 +++++++++++
 include/drm/drmcgrp_vendors.h               |  1 +
 5 files changed, 67 insertions(+)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h

diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile b/drivers/gpu/drm/amd/amdgpu/Makefile
index 138cb787d27e..5cf8048f2d75 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -186,4 +186,7 @@ amdgpu-y += $(AMD_DISPLAY_FILES)
 
 endif
 
+#DRM cgroup controller
+amdgpu-y += amdgpu_drmcgrp.o
+
 obj-$(CONFIG_DRM_AMDGPU)+= amdgpu.o
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 30bc345d6fdf..ad0373f83ed3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -33,6 +33,7 @@
 #include <drm/drm_crtc_helper.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/amdgpu_drm.h>
+#include <drm/drm_cgroup.h>
 #include <linux/vgaarb.h>
 #include <linux/vga_switcheroo.h>
 #include <linux/efi.h>
@@ -2645,6 +2646,12 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 		goto failed;
 	}
 
+	/* TODO:docs */
+	if (drmcgrp_vendors[amd_drmcgrp_vendor_id] == NULL)
+		drmcgrp_register_vendor(&amd_drmcgrp_vendor, amd_drmcgrp_vendor_id);
+
+	drmcgrp_register_device(adev->ddev, amd_drmcgrp_vendor_id);
+
 	return 0;
 
 failed:
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c
new file mode 100644
index 000000000000..ed8aac17769c
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c
@@ -0,0 +1,37 @@
+// SPDX-License-Identifier: MIT
+// Copyright 2018 Advanced Micro Devices, Inc.
+#include <linux/slab.h>
+#include <linux/cgroup_drm.h>
+#include <drm/drm_device.h>
+#include "amdgpu_drmcgrp.h"
+
+struct cftype files[] = {
+	{ } /* terminate */
+};
+
+struct cftype *drmcgrp_amd_get_cftypes(void)
+{
+	return files;
+}
+
+struct drmcgrp_device_resource *amd_drmcgrp_alloc_dev_resource(void)
+{
+	struct amd_drmcgrp_dev_resource *a_ddr;
+
+	a_ddr = kzalloc(sizeof(struct amd_drmcgrp_dev_resource), GFP_KERNEL);
+	if (!a_ddr)
+		return ERR_PTR(-ENOMEM);
+
+	return &a_ddr->ddr;
+}
+
+void amd_drmcgrp_free_dev_resource(struct drmcgrp_device_resource *ddr)
+{
+	kfree(ddr_amdddr(ddr));
+}
+
+struct drmcgrp_vendor amd_drmcgrp_vendor = {
+	.get_cftypes = drmcgrp_amd_get_cftypes,
+	.alloc_dev_resource = amd_drmcgrp_alloc_dev_resource,
+	.free_dev_resource = amd_drmcgrp_free_dev_resource,
+};
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h
new file mode 100644
index 000000000000..e2934b7a49f5
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: MIT
+ * Copyright 2018 Advanced Micro Devices, Inc.
+ */
+#ifndef _AMDGPU_DRMCGRP_H
+#define _AMDGPU_DRMCGRP_H
+
+#include <linux/cgroup_drm.h>
+
+/* for AMD specific DRM resources */
+struct amd_drmcgrp_dev_resource {
+	struct drmcgrp_device_resource ddr;
+};
+
+static inline struct amd_drmcgrp_dev_resource *ddr_amdddr(struct drmcgrp_device_resource *ddr)
+{
+	return ddr ? container_of(ddr, struct amd_drmcgrp_dev_resource, ddr) : NULL;
+}
+
+#endif	/* _AMDGPU_DRMCGRP_H */
diff --git a/include/drm/drmcgrp_vendors.h b/include/drm/drmcgrp_vendors.h
index b04d8649851b..6cfbf1825344 100644
--- a/include/drm/drmcgrp_vendors.h
+++ b/include/drm/drmcgrp_vendors.h
@@ -3,5 +3,6 @@
  */
 #if IS_ENABLED(CONFIG_CGROUP_DRM)
 
+DRMCGRP_VENDOR(amd)
 
 #endif
-- 
2.19.1

* [PATCH RFC 4/5] drm/amdgpu: Add accounting of command submission via DRM cgroup
       [not found] ` <20181120185814.13362-1-Kenny.Ho-5C7GfCeVMHo@public.gmane.org>
  2018-11-20 18:58   ` [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices Kenny Ho
  2018-11-20 18:58   ` [PATCH RFC 3/5] drm/amdgpu: Add DRM cgroup support for AMD devices Kenny Ho
@ 2018-11-20 18:58   ` Kenny Ho
  2018-11-20 20:57       ` Eric Anholt
  2018-11-21  9:58     ` Christian König
  2 siblings, 2 replies; 80+ messages in thread
From: Kenny Ho @ 2018-11-20 18:58 UTC (permalink / raw)
  To: y2kenny-Re5JQEeQqe8AvxtiuMwx3w, Kenny.Ho-5C7GfCeVMHo,
	cgroups-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Account for the number of commands submitted to amdgpu, by type, on a per
cgroup basis, for the purpose of profiling/monitoring applications.

The x prefix in the control file name x.cmd_submitted.amd.stat signifies
that it is experimental.
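
For reference, amd_drmcgrp_cmd_submit_accounting_read() below emits one block
per registered AMD device, keyed by ring type.  A read of the control file
looks roughly like this (the counter values are made-up examples):

	---
	card0:
	  gfx: 1402
	  compute: 37
	  sdma: 12
	  uvd: 0
	  vce: 0
	  kiq: 0
	  uvd_enc: 0
	  vcn_dec: 0
	  vcn_enc: 0
	  vcn_jpeg: 0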

Change-Id: Ibc22e5bda600f54fe820fe0af5400ca348691550
Signed-off-by: Kenny Ho <Kenny.Ho@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c      |  5 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c | 54 +++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h |  5 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c    | 15 ++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h    |  5 +-
 5 files changed, 83 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 663043c8f0f5..b448160aed89 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -33,6 +33,7 @@
 #include "amdgpu_trace.h"
 #include "amdgpu_gmc.h"
 #include "amdgpu_gem.h"
+#include "amdgpu_drmcgrp.h"
 
 static int amdgpu_cs_user_fence_chunk(struct amdgpu_cs_parser *p,
 				      struct drm_amdgpu_cs_chunk_fence *data,
@@ -1275,6 +1276,7 @@ int amdgpu_cs_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
 	union drm_amdgpu_cs *cs = data;
 	struct amdgpu_cs_parser parser = {};
 	bool reserved_buffers = false;
+	struct amdgpu_ring *ring;
 	int i, r;
 
 	if (!adev->accel_working)
@@ -1317,6 +1319,9 @@ int amdgpu_cs_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
 	if (r)
 		goto out;
 
+	ring = to_amdgpu_ring(parser.entity->rq->sched);
+	amdgpu_drmcgrp_count_cs(current, dev, ring->funcs->type);
+
 	r = amdgpu_cs_submit(&parser, cs);
 
 out:
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c
index ed8aac17769c..853b77532428 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c
@@ -1,11 +1,65 @@
 // SPDX-License-Identifier: MIT
 // Copyright 2018 Advanced Micro Devices, Inc.
 #include <linux/slab.h>
+#include <linux/mutex.h>
 #include <linux/cgroup_drm.h>
 #include <drm/drm_device.h>
+#include "amdgpu_ring.h"
 #include "amdgpu_drmcgrp.h"
 
+void amdgpu_drmcgrp_count_cs(struct task_struct *task, struct drm_device *dev,
+		enum amdgpu_ring_type r_type)
+{
+	struct drmcgrp *drmcgrp = get_drmcgrp(task);
+	struct drmcgrp_device_resource *ddr;
+	struct drmcgrp *p;
+	struct amd_drmcgrp_dev_resource *a_ddr;
+
+	if (drmcgrp == NULL)
+		return;
+
+	ddr = drmcgrp->dev_resources[dev->primary->index];
+
+	mutex_lock(&ddr->ddev->mutex);
+	for (p = drmcgrp; p != NULL; p = parent_drmcgrp(drmcgrp)) {
+		a_ddr = ddr_amdddr(p->dev_resources[dev->primary->index]);
+
+		a_ddr->cs_count[r_type]++;
+	}
+	mutex_unlock(&ddr->ddev->mutex);
+}
+
+int amd_drmcgrp_cmd_submit_accounting_read(struct seq_file *sf, void *v)
+{
+	struct drmcgrp *drmcgrp = css_drmcgrp(seq_css(sf));
+	struct drmcgrp_device_resource *ddr = NULL;
+	struct amd_drmcgrp_dev_resource *a_ddr = NULL;
+	int i, j;
+
+	seq_puts(sf, "---\n");
+	for (i = 0; i < MAX_DRM_DEV; i++) {
+		ddr = drmcgrp->dev_resources[i];
+
+		if (ddr == NULL || ddr->ddev->vid != amd_drmcgrp_vendor_id)
+			continue;
+
+		a_ddr = ddr_amdddr(ddr);
+
+		seq_printf(sf, "card%d:\n", i);
+		for (j = 0; j < __MAX_AMDGPU_RING_TYPE; j++)
+			seq_printf(sf, "  %s: %llu\n", amdgpu_ring_names[j], a_ddr->cs_count[j]);
+	}
+
+	return 0;
+}
+
+
 struct cftype files[] = {
+	{
+		.name = "x.cmd_submitted.amd.stat",
+		.seq_show = amd_drmcgrp_cmd_submit_accounting_read,
+		.flags = CFTYPE_NOT_ON_ROOT,
+	},
 	{ } /* terminate */
 };
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h
index e2934b7a49f5..f894a9a1059f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h
@@ -5,12 +5,17 @@
 #define _AMDGPU_DRMCGRP_H
 
 #include <linux/cgroup_drm.h>
+#include "amdgpu_ring.h"
 
 /* for AMD specific DRM resources */
 struct amd_drmcgrp_dev_resource {
 	struct drmcgrp_device_resource ddr;
+	u64 cs_count[__MAX_AMDGPU_RING_TYPE];
 };
 
+void amdgpu_drmcgrp_count_cs(struct task_struct *task, struct drm_device *dev,
+		enum amdgpu_ring_type r_type);
+
 static inline struct amd_drmcgrp_dev_resource *ddr_amdddr(struct drmcgrp_device_resource *ddr)
 {
 	return ddr ? container_of(ddr, struct amd_drmcgrp_dev_resource, ddr) : NULL;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
index b70e85ec147d..1606f84d2334 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@@ -34,6 +34,21 @@
 #include "amdgpu.h"
 #include "atom.h"
 
+char const *amdgpu_ring_names[] = {
+	[AMDGPU_RING_TYPE_GFX]		= "gfx",
+	[AMDGPU_RING_TYPE_COMPUTE]	= "compute",
+	[AMDGPU_RING_TYPE_SDMA]		= "sdma",
+	[AMDGPU_RING_TYPE_UVD]		= "uvd",
+	[AMDGPU_RING_TYPE_VCE]		= "vce",
+	[AMDGPU_RING_TYPE_KIQ]		= "kiq",
+	[AMDGPU_RING_TYPE_UVD_ENC]	= "uvd_enc",
+	[AMDGPU_RING_TYPE_VCN_DEC]	= "vcn_dec",
+	[AMDGPU_RING_TYPE_VCN_ENC]	= "vcn_enc",
+	[AMDGPU_RING_TYPE_VCN_JPEG]	= "vcn_jpeg",
+	[__MAX_AMDGPU_RING_TYPE]	= "_max"
+
+};
+
 /*
  * Rings
  * Most engines on the GPU are fed via ring buffers.  Ring
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index 4caa301ce454..e292b7e89c6a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -56,9 +56,12 @@ enum amdgpu_ring_type {
 	AMDGPU_RING_TYPE_UVD_ENC,
 	AMDGPU_RING_TYPE_VCN_DEC,
 	AMDGPU_RING_TYPE_VCN_ENC,
-	AMDGPU_RING_TYPE_VCN_JPEG
+	AMDGPU_RING_TYPE_VCN_JPEG,
+	__MAX_AMDGPU_RING_TYPE
 };
 
+extern char const *amdgpu_ring_names[];
+
 struct amdgpu_device;
 struct amdgpu_ring;
 struct amdgpu_ib;
-- 
2.19.1

* [PATCH RFC 5/5] drm/amdgpu: Add accounting of buffer object creation request via DRM cgroup
  2018-11-20 18:58 [PATCH RFC 0/5] DRM cgroup controller Kenny Ho
  2018-11-20 18:58 ` [PATCH RFC 1/5] cgroup: Introduce cgroup for drm subsystem Kenny Ho
       [not found] ` <20181120185814.13362-1-Kenny.Ho-5C7GfCeVMHo@public.gmane.org>
@ 2018-11-20 18:58 ` Kenny Ho
       [not found]   ` <20181120185814.13362-6-Kenny.Ho-5C7GfCeVMHo@public.gmane.org>
  2018-11-21  1:43 ` ✗ Fi.CI.BAT: failure for DRM cgroup controller Patchwork
  2019-05-09 21:04 ` [RFC PATCH v2 0/5] new cgroup controller for gpu/drm subsystem Kenny Ho
  4 siblings, 1 reply; 80+ messages in thread
From: Kenny Ho @ 2018-11-20 18:58 UTC (permalink / raw)
  To: y2kenny, Kenny.Ho, cgroups, dri-devel, amd-gfx, intel-gfx

Account for the total size of buffer objects requested from amdgpu, by
buffer type, on a per cgroup basis.

The x prefix in the control file name x.bo_requested.amd.stat signifies
that it is experimental.
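
For reference, amd_drmcgrp_bo_req_stat_read() below emits one block per
registered AMD device, keyed by memory domain, with the accumulated request
sizes in bytes.  A read looks roughly like this (the values are made-up
examples):

	---
	card0:
	  cpu: 0
	  gtt: 8388608
	  vram: 268435456
	  gds: 0
	  gws: 0
	  oa: 0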

Change-Id: Ifb680c4bcf3652879a7a659510e25680c2465cf6
Signed-off-by: Kenny Ho <Kenny.Ho@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c | 56 +++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h |  3 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c     | 13 +++++
 include/uapi/drm/amdgpu_drm.h               | 24 ++++++---
 4 files changed, 90 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c
index 853b77532428..e3d98ed01b79 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c
@@ -7,6 +7,57 @@
 #include "amdgpu_ring.h"
 #include "amdgpu_drmcgrp.h"
 
+void amdgpu_drmcgrp_count_bo_req(struct task_struct *task, struct drm_device *dev,
+		u32 domain, unsigned long size)
+{
+	struct drmcgrp *drmcgrp = get_drmcgrp(task);
+	struct drmcgrp_device_resource *ddr;
+	struct drmcgrp *p;
+	struct amd_drmcgrp_dev_resource *a_ddr;
+        int i;
+
+	if (drmcgrp == NULL)
+		return;
+
+	ddr = drmcgrp->dev_resources[dev->primary->index];
+
+	mutex_lock(&ddr->ddev->mutex);
+	for (p = drmcgrp; p != NULL; p = parent_drmcgrp(drmcgrp)) {
+		a_ddr = ddr_amdddr(p->dev_resources[dev->primary->index]);
+
+		for (i = 0; i < __MAX_AMDGPU_MEM_DOMAIN; i++)
+			if ( (1 << i) & domain)
+				a_ddr->bo_req_count[i] += size;
+	}
+	mutex_unlock(&ddr->ddev->mutex);
+}
+
+int amd_drmcgrp_bo_req_stat_read(struct seq_file *sf, void *v)
+{
+	struct drmcgrp *drmcgrp = css_drmcgrp(seq_css(sf));
+	struct drmcgrp_device_resource *ddr = NULL;
+	struct amd_drmcgrp_dev_resource *a_ddr = NULL;
+	int i, j;
+
+	seq_puts(sf, "---\n");
+	for (i = 0; i < MAX_DRM_DEV; i++) {
+		ddr = drmcgrp->dev_resources[i];
+
+		if (ddr == NULL || ddr->ddev->vid != amd_drmcgrp_vendor_id)
+			continue;
+
+		a_ddr = ddr_amdddr(ddr);
+
+		seq_printf(sf, "card%d:\n", i);
+		for (j = 0; j < __MAX_AMDGPU_MEM_DOMAIN; j++)
+			seq_printf(sf, "  %s: %llu\n", amdgpu_mem_domain_names[j], a_ddr->bo_req_count[j]);
+	}
+
+	return 0;
+}
+
+
+
 void amdgpu_drmcgrp_count_cs(struct task_struct *task, struct drm_device *dev,
 		enum amdgpu_ring_type r_type)
 {
@@ -55,6 +106,11 @@ int amd_drmcgrp_cmd_submit_accounting_read(struct seq_file *sf, void *v)
 
 
 struct cftype files[] = {
+	{
+		.name = "x.bo_requested.amd.stat",
+		.seq_show = amd_drmcgrp_bo_req_stat_read,
+		.flags = CFTYPE_NOT_ON_ROOT,
+	},
 	{
 		.name = "x.cmd_submitted.amd.stat",
 		.seq_show = amd_drmcgrp_cmd_submit_accounting_read,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h
index f894a9a1059f..8b9d61e47dde 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h
@@ -11,10 +11,13 @@
 struct amd_drmcgrp_dev_resource {
 	struct drmcgrp_device_resource ddr;
 	u64 cs_count[__MAX_AMDGPU_RING_TYPE];
+	u64 bo_req_count[__MAX_AMDGPU_MEM_DOMAIN];
 };
 
 void amdgpu_drmcgrp_count_cs(struct task_struct *task, struct drm_device *dev,
 		enum amdgpu_ring_type r_type);
+void amdgpu_drmcgrp_count_bo_req(struct task_struct *task, struct drm_device *dev,
+		u32 domain, unsigned long size);
 
 static inline struct amd_drmcgrp_dev_resource *ddr_amdddr(struct drmcgrp_device_resource *ddr)
 {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 7b3d1ebda9df..339e1d3edad8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -31,6 +31,17 @@
 #include <drm/amdgpu_drm.h>
 #include "amdgpu.h"
 #include "amdgpu_display.h"
+#include "amdgpu_drmcgrp.h"
+
+char const *amdgpu_mem_domain_names[] = {
+	[AMDGPU_MEM_DOMAIN_CPU]		= "cpu",
+	[AMDGPU_MEM_DOMAIN_GTT]		= "gtt",
+	[AMDGPU_MEM_DOMAIN_VRAM]	= "vram",
+	[AMDGPU_MEM_DOMAIN_GDS]		= "gds",
+	[AMDGPU_MEM_DOMAIN_GWS]		= "gws",
+	[AMDGPU_MEM_DOMAIN_OA]		= "oa",
+	[__MAX_AMDGPU_MEM_DOMAIN]	= "_max"
+};
 
 void amdgpu_gem_object_free(struct drm_gem_object *gobj)
 {
@@ -52,6 +63,8 @@ int amdgpu_gem_object_create(struct amdgpu_device *adev, unsigned long size,
 	struct amdgpu_bo_param bp;
 	int r;
 
+	amdgpu_drmcgrp_count_bo_req(current, adev->ddev, initial_domain, size);
+
 	memset(&bp, 0, sizeof(bp));
 	*obj = NULL;
 	/* At least align on page size */
diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
index 370e9a5536ef..531726443104 100644
--- a/include/uapi/drm/amdgpu_drm.h
+++ b/include/uapi/drm/amdgpu_drm.h
@@ -72,6 +72,18 @@ extern "C" {
 #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union drm_amdgpu_fence_to_handle)
 #define DRM_IOCTL_AMDGPU_SCHED		DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
 
+enum amdgpu_mem_domain {
+	AMDGPU_MEM_DOMAIN_CPU,
+	AMDGPU_MEM_DOMAIN_GTT,
+	AMDGPU_MEM_DOMAIN_VRAM,
+	AMDGPU_MEM_DOMAIN_GDS,
+	AMDGPU_MEM_DOMAIN_GWS,
+	AMDGPU_MEM_DOMAIN_OA,
+	__MAX_AMDGPU_MEM_DOMAIN
+};
+
+extern char const *amdgpu_mem_domain_names[];
+
 /**
  * DOC: memory domains
  *
@@ -95,12 +107,12 @@ extern "C" {
  * %AMDGPU_GEM_DOMAIN_OA	Ordered append, used by 3D or Compute engines
  * for appending data.
  */
-#define AMDGPU_GEM_DOMAIN_CPU		0x1
-#define AMDGPU_GEM_DOMAIN_GTT		0x2
-#define AMDGPU_GEM_DOMAIN_VRAM		0x4
-#define AMDGPU_GEM_DOMAIN_GDS		0x8
-#define AMDGPU_GEM_DOMAIN_GWS		0x10
-#define AMDGPU_GEM_DOMAIN_OA		0x20
+#define AMDGPU_GEM_DOMAIN_CPU		(1 << AMDGPU_MEM_DOMAIN_CPU)
+#define AMDGPU_GEM_DOMAIN_GTT		(1 << AMDGPU_MEM_DOMAIN_GTT)
+#define AMDGPU_GEM_DOMAIN_VRAM		(1 << AMDGPU_MEM_DOMAIN_VRAM)
+#define AMDGPU_GEM_DOMAIN_GDS		(1 << AMDGPU_MEM_DOMAIN_GDS)
+#define AMDGPU_GEM_DOMAIN_GWS		(1 << AMDGPU_MEM_DOMAIN_GWS)
+#define AMDGPU_GEM_DOMAIN_OA		(1 << AMDGPU_MEM_DOMAIN_OA)
 #define AMDGPU_GEM_DOMAIN_MASK		(AMDGPU_GEM_DOMAIN_CPU | \
 					 AMDGPU_GEM_DOMAIN_GTT | \
 					 AMDGPU_GEM_DOMAIN_VRAM | \
-- 
2.19.1

* Re: [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices
       [not found]     ` <20181120185814.13362-3-Kenny.Ho-5C7GfCeVMHo@public.gmane.org>
@ 2018-11-20 20:21       ` Tejun Heo
       [not found]         ` <20181120202141.GA2509588-LpCCV3molIbIZ9tKgghJQw2O0Ztt9esIQQ4Iyu8u01E@public.gmane.org>
  2018-11-21  9:53       ` Christian König
  1 sibling, 1 reply; 80+ messages in thread
From: Tejun Heo @ 2018-11-20 20:21 UTC (permalink / raw)
  To: Kenny Ho
  Cc: cgroups-u79uwXL29TY76Z2rM5mHXA,
	intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	y2kenny-Re5JQEeQqe8AvxtiuMwx3w,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Hello,

On Tue, Nov 20, 2018 at 01:58:11PM -0500, Kenny Ho wrote:
> Since many parts of the DRM subsystem have vendor-specific
> implementations, we introduce mechanisms for vendors to register their
> specific resources and control files with the DRM cgroup subsystem.  A
> vendor registers itself with the DRM cgroup subsystem first before
> registering individual DRM devices to the cgroup subsystem.
> 
> In addition to the cgroup_subsys_state that is common to all DRM
> devices, a device-specific state is introduced and it is allocated
> according to the vendor of the device.

So, I'm still pretty negative about adding a drm controller at this
point.  There isn't enough of a common resource model defined yet, and
until that gets sorted out I think it's in the best interest of
everyone involved to keep it inside drm or the specific driver proper.

Thanks.

-- 
tejun
* Re: [PATCH RFC 5/5] drm/amdgpu: Add accounting of buffer object creation request via DRM cgroup
       [not found]   ` <20181120185814.13362-6-Kenny.Ho-5C7GfCeVMHo@public.gmane.org>
@ 2018-11-20 20:56       ` Eric Anholt
  2018-11-21 10:00     ` Christian König
  1 sibling, 0 replies; 80+ messages in thread
From: Eric Anholt @ 2018-11-20 20:56 UTC (permalink / raw)
  To: Kenny Ho, y2kenny-Re5JQEeQqe8AvxtiuMwx3w


Kenny Ho <Kenny.Ho-5C7GfCeVMHo@public.gmane.org> writes:

> Account for the total size of buffer objects requested from amdgpu, by
> buffer type, on a per cgroup basis.
>
> The x prefix in the control file name x.bo_requested.amd.stat signifies
> that it is experimental.

Why is counting the size of buffer objects ever allocated useful,
as opposed to the current size of buffer objects allocated?

And, really, why is this stat in cgroups, instead of a debugfs entry?

* Re: [PATCH RFC 4/5] drm/amdgpu: Add accounting of command submission via DRM cgroup
  2018-11-20 18:58   ` [PATCH RFC 4/5] drm/amdgpu: Add accounting of command submission via DRM cgroup Kenny Ho
@ 2018-11-20 20:57       ` Eric Anholt
  2018-11-21  9:58     ` Christian König
  1 sibling, 0 replies; 80+ messages in thread
From: Eric Anholt @ 2018-11-20 20:57 UTC (permalink / raw)
  To: Kenny Ho, y2kenny


Kenny Ho <Kenny.Ho@amd.com> writes:

> Account for the number of commands submitted to amdgpu, by type, on a per
> cgroup basis, for the purpose of profiling/monitoring applications.

For profiling other drivers, I've used perf tracepoints, which let you
get useful timelines of multiple events in the driver.  Have you made
use of this stat for productive profiling?

* RE: [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices
       [not found]         ` <20181120202141.GA2509588-LpCCV3molIbIZ9tKgghJQw2O0Ztt9esIQQ4Iyu8u01E@public.gmane.org>
@ 2018-11-20 22:21           ` Ho, Kenny
       [not found]             ` <DM5PR12MB1226E972538A45325114ADF683D90-2J9CzHegvk+lTFawYev2gQdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
  0 siblings, 1 reply; 80+ messages in thread
From: Ho, Kenny @ 2018-11-20 22:21 UTC (permalink / raw)
  To: Tejun Heo
  Cc: cgroups-u79uwXL29TY76Z2rM5mHXA,
	intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	y2kenny-Re5JQEeQqe8AvxtiuMwx3w,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Hi Tejun,

Thanks for the reply.  A few clarifying questions:

On Tue, Nov 20, 2018 at 3:21 PM Tejun Heo <tj@kernel.org> wrote:
> So, I'm still pretty negative about adding drm controller at this
> point.  There isn't enough of common resource model defined yet and
> until that gets sorted out I think it's in the best interest of
> everyone involved to keep it inside drm or specific driver proper.
By this reply, are you suggesting that vendor specific resources will never
be acceptable to be managed under cgroup?  Let's say a user wants to have
functionality similar to what cgroup is offering, but to manage vendor
specific resources; what would you suggest as a solution?  When you say
keeping vendor specific resource regulation inside drm or specific drivers,
do you mean we should replicate the cgroup infrastructure there, or do you
mean either drm or a specific driver should query an existing hierarchy
(such as device or perhaps cpu) for the process organization information?

To put the questions in more concrete terms, let's say a user wants to
expose a certain part of a gpu to a particular cgroup, similar to the way
selected cpu cores are exposed to a cgroup via cpuset; how should we go
about enabling such functionality?

Regards,
Kenny
* Re: [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices
       [not found]             ` <DM5PR12MB1226E972538A45325114ADF683D90-2J9CzHegvk+lTFawYev2gQdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
@ 2018-11-20 22:30               ` Tejun Heo
       [not found]                 ` <20181120223018.GB2509588-LpCCV3molIbIZ9tKgghJQw2O0Ztt9esIQQ4Iyu8u01E@public.gmane.org>
  0 siblings, 1 reply; 80+ messages in thread
From: Tejun Heo @ 2018-11-20 22:30 UTC (permalink / raw)
  To: Ho, Kenny
  Cc: cgroups-u79uwXL29TY76Z2rM5mHXA,
	intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	y2kenny-Re5JQEeQqe8AvxtiuMwx3w,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Hello,

On Tue, Nov 20, 2018 at 10:21:14PM +0000, Ho, Kenny wrote:
> By this reply, are you suggesting that vendor specific resources
> will never be acceptable to be managed under cgroup?  Let say a user

I wouldn't say never, but whatever gets included as a cgroup
controller should have clearly defined resource abstractions and the
control schemes around them, including support for delegation.  AFAICS,
the gpu side still seems to have a long way to go (and it's not clear
whether that's somewhere it will or needs to end up).

> want to have similar functionality as what cgroup is offering but to
> manage vendor specific resources, what would you suggest as a
> solution?  When you say keeping vendor specific resource regulation
> inside drm or specific drivers, do you mean we should replicate the
> cgroup infrastructure there or do you mean either drm or specific
> driver should query existing hierarchy (such as device or perhaps
> cpu) for the process organization information?
> 
> To put the questions in more concrete terms, let say a user wants to
> expose certain part of a gpu to a particular cgroup similar to the
> way selective cpu cores are exposed to a cgroup via cpuset, how
> should we go about enabling such functionality?

Do what the intel driver or bpf is doing?  It's not difficult to hook
into cgroup for identification purposes.

Thanks.

-- 
tejun
* ✗ Fi.CI.BAT: failure for DRM cgroup controller
  2018-11-20 18:58 [PATCH RFC 0/5] DRM cgroup controller Kenny Ho
                   ` (2 preceding siblings ...)
  2018-11-20 18:58 ` [PATCH RFC 5/5] drm/amdgpu: Add accounting of buffer object creation request " Kenny Ho
@ 2018-11-21  1:43 ` Patchwork
  2019-05-09 21:04 ` [RFC PATCH v2 0/5] new cgroup controller for gpu/drm subsystem Kenny Ho
  4 siblings, 0 replies; 80+ messages in thread
From: Patchwork @ 2018-11-21  1:43 UTC (permalink / raw)
  To: Kenny Ho; +Cc: intel-gfx

== Series Details ==

Series: DRM cgroup controller
URL   : https://patchwork.freedesktop.org/series/52799/
State : failure

== Summary ==

CALL    scripts/checksyscalls.sh
  DESCEND  objtool
  CHK     include/generated/compile.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_device.o
In file included from drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:36:0:
./include/drm/drm_cgroup.h:28:43: warning: ‘struct drmcgrp_vendor’ declared inside parameter list will not be visible outside of this definition or declaration
 static int drmcgrp_register_vendor(struct drmcgrp_vendor *vendor, enum drmcgrp_vendor_id id)
                                           ^~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c: In function ‘amdgpu_device_init’:
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:2653:6: error: ‘drmcgrp_vendors’ undeclared (first use in this function); did you mean ‘drmcgrp_vendor_id’?
  if (drmcgrp_vendors[amd_drmcgrp_vendor_id] == NULL)
      ^~~~~~~~~~~~~~~
      drmcgrp_vendor_id
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:2653:6: note: each undeclared identifier is reported only once for each function it appears in
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:2653:22: error: ‘amd_drmcgrp_vendor_id’ undeclared (first use in this function); did you mean ‘drmcgrp_vendor_id’?
  if (drmcgrp_vendors[amd_drmcgrp_vendor_id] == NULL)
                      ^~~~~~~~~~~~~~~~~~~~~
                      drmcgrp_vendor_id
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:2654:28: error: ‘amd_drmcgrp_vendor’ undeclared (first use in this function); did you mean ‘amd_drmcgrp_vendor_id’?
   drmcgrp_register_vendor(&amd_drmcgrp_vendor, amd_drmcgrp_vendor_id);
                            ^~~~~~~~~~~~~~~~~~
                            amd_drmcgrp_vendor_id
scripts/Makefile.build:293: recipe for target 'drivers/gpu/drm/amd/amdgpu/amdgpu_device.o' failed
make[4]: *** [drivers/gpu/drm/amd/amdgpu/amdgpu_device.o] Error 1
scripts/Makefile.build:518: recipe for target 'drivers/gpu/drm/amd/amdgpu' failed
make[3]: *** [drivers/gpu/drm/amd/amdgpu] Error 2
scripts/Makefile.build:518: recipe for target 'drivers/gpu/drm' failed
make[2]: *** [drivers/gpu/drm] Error 2
scripts/Makefile.build:518: recipe for target 'drivers/gpu' failed
make[1]: *** [drivers/gpu] Error 2
Makefile:1060: recipe for target 'drivers' failed
make: *** [drivers] Error 2

* Re: [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices
       [not found]     ` <20181120185814.13362-3-Kenny.Ho-5C7GfCeVMHo@public.gmane.org>
  2018-11-20 20:21       ` Tejun Heo
@ 2018-11-21  9:53       ` Christian König
  1 sibling, 0 replies; 80+ messages in thread
From: Christian König @ 2018-11-21  9:53 UTC (permalink / raw)
  To: Kenny Ho, y2kenny-Re5JQEeQqe8AvxtiuMwx3w,
	cgroups-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 20.11.18 at 19:58, Kenny Ho wrote:
> Since many parts of the DRM subsystem have vendor-specific
> implementations, we introduce mechanisms for vendors to register their
> specific resources and control files with the DRM cgroup subsystem.  A
> vendor registers itself with the DRM cgroup subsystem first before
> registering individual DRM devices to the cgroup subsystem.
>
> In addition to the cgroup_subsys_state that is common to all DRM
> devices, a device-specific state is introduced and it is allocated
> according to the vendor of the device.

Mhm, it's most likely just a naming issue, but I think we should drop the
term "vendor" here and rather use "driver" instead.

The background is that both Intel and AMD have multiple drivers for
different hardware generations, and we certainly don't want to handle all
drivers from one vendor the same way.

Christian.

>
> Change-Id: I908ee6975ea0585e4c30eafde4599f87094d8c65
> Signed-off-by: Kenny Ho <Kenny.Ho@amd.com>
> ---
>   include/drm/drm_cgroup.h      | 39 ++++++++++++++++
>   include/drm/drmcgrp_vendors.h |  7 +++
>   include/linux/cgroup_drm.h    | 26 +++++++++++
>   kernel/cgroup/drm.c           | 84 +++++++++++++++++++++++++++++++++++
>   4 files changed, 156 insertions(+)
>   create mode 100644 include/drm/drm_cgroup.h
>   create mode 100644 include/drm/drmcgrp_vendors.h
>
> diff --git a/include/drm/drm_cgroup.h b/include/drm/drm_cgroup.h
> new file mode 100644
> index 000000000000..26cbea7059a6
> --- /dev/null
> +++ b/include/drm/drm_cgroup.h
> @@ -0,0 +1,39 @@
> +/* SPDX-License-Identifier: MIT
> + * Copyright 2018 Advanced Micro Devices, Inc.
> + */
> +#ifndef __DRM_CGROUP_H__
> +#define __DRM_CGROUP_H__
> +
> +#define DRMCGRP_VENDOR(_x) _x ## _drmcgrp_vendor_id,
> +enum drmcgrp_vendor_id {
> +#include <drm/drmcgrp_vendors.h>
> +	DRMCGRP_VENDOR_COUNT,
> +};
> +#undef DRMCGRP_VENDOR
> +
> +#define DRMCGRP_VENDOR(_x) extern struct drmcgrp_vendor _x ## _drmcgrp_vendor;
> +#include <drm/drmcgrp_vendors.h>
> +#undef DRMCGRP_VENDOR
> +
> +
> +
> +#ifdef CONFIG_CGROUP_DRM
> +
> +extern struct drmcgrp_vendor *drmcgrp_vendors[];
> +
> +int drmcgrp_register_vendor(struct drmcgrp_vendor *vendor, enum drmcgrp_vendor_id id);
> +int drmcgrp_register_device(struct drm_device *device, enum drmcgrp_vendor_id id);
> +
> +#else
> +static int drmcgrp_register_vendor(struct drmcgrp_vendor *vendor, enum drmcgrp_vendor_id id)
> +{
> +	return 0;
> +}
> +
> +static int drmcgrp_register_device(struct drm_device *device, enum drmcgrp_vendor_id id)
> +{
> +	return 0;
> +}
> +
> +#endif /* CONFIG_CGROUP_DRM */
> +#endif /* __DRM_CGROUP_H__ */
> diff --git a/include/drm/drmcgrp_vendors.h b/include/drm/drmcgrp_vendors.h
> new file mode 100644
> index 000000000000..b04d8649851b
> --- /dev/null
> +++ b/include/drm/drmcgrp_vendors.h
> @@ -0,0 +1,7 @@
> +/* SPDX-License-Identifier: MIT
> + * Copyright 2018 Advanced Micro Devices, Inc.
> + */
> +#if IS_ENABLED(CONFIG_CGROUP_DRM)
> +
> +
> +#endif
> diff --git a/include/linux/cgroup_drm.h b/include/linux/cgroup_drm.h
> index 79ab38b0f46d..a776662d9593 100644
> --- a/include/linux/cgroup_drm.h
> +++ b/include/linux/cgroup_drm.h
> @@ -6,10 +6,36 @@
>   
>   #ifdef CONFIG_CGROUP_DRM
>   
> +#include <linux/mutex.h>
>   #include <linux/cgroup.h>
> +#include <drm/drm_file.h>
> +#include <drm/drm_cgroup.h>
> +
> +/* limit defined per the way drm_minor_alloc operates */
> +#define MAX_DRM_DEV (64 * DRM_MINOR_RENDER)
> +
> +struct drmcgrp_device {
> +	enum drmcgrp_vendor_id	vid;
> +	struct drm_device	*dev;
> +	struct mutex		mutex;
> +};
> +
> +/* vendor-common resource counting goes here */
> +/* this struct should be included in the vendor specific resource */
> +struct drmcgrp_device_resource {
> +	struct drmcgrp_device	*ddev;
> +};
> +
> +struct drmcgrp_vendor {
> +	struct cftype *(*get_cftypes)(void);
> +	struct drmcgrp_device_resource *(*alloc_dev_resource)(void);
> +	void (*free_dev_resource)(struct drmcgrp_device_resource *dev_resource);
> +};
> +
>   
>   struct drmcgrp {
>   	struct cgroup_subsys_state	css;
> +	struct drmcgrp_device_resource	*dev_resources[MAX_DRM_DEV];
>   };
>   
>   static inline struct drmcgrp *css_drmcgrp(struct cgroup_subsys_state *css)
> diff --git a/kernel/cgroup/drm.c b/kernel/cgroup/drm.c
> index d9e194b9aead..f9630cc389bc 100644
> --- a/kernel/cgroup/drm.c
> +++ b/kernel/cgroup/drm.c
> @@ -1,8 +1,30 @@
>   // SPDX-License-Identifier: MIT
>   // Copyright 2018 Advanced Micro Devices, Inc.
> +#include <linux/export.h>
>   #include <linux/slab.h>
>   #include <linux/cgroup.h>
> +#include <linux/fs.h>
> +#include <linux/seq_file.h>
> +#include <linux/mutex.h>
>   #include <linux/cgroup_drm.h>
> +#include <drm/drm_device.h>
> +#include <drm/drm_cgroup.h>
> +
> +/* generate an array of drm cgroup vendor pointers */
> +#define DRMCGRP_VENDOR(_x)[_x ## _drmcgrp_vendor_id] = NULL,
> +struct drmcgrp_vendor *drmcgrp_vendors[] = {
> +#include <drm/drmcgrp_vendors.h>
> +};
> +#undef DRMCGRP_VENDOR
> +EXPORT_SYMBOL(drmcgrp_vendors);
> +
> +static DEFINE_MUTEX(drmcgrp_mutex);
> +
> +/* indexed by drm_minor for access speed */
> +static struct drmcgrp_device	*known_drmcgrp_devs[MAX_DRM_DEV];
> +
> +static int max_minor;
> +
>   
>   static u64 drmcgrp_test_read(struct cgroup_subsys_state *css,
>   					struct cftype *cft)
> @@ -13,6 +35,12 @@ static u64 drmcgrp_test_read(struct cgroup_subsys_state *css,
>   static void drmcgrp_css_free(struct cgroup_subsys_state *css)
>   {
>   	struct drmcgrp *drmcgrp = css_drmcgrp(css);
> +	int i;
> +
> +	for (i = 0; i <= max_minor; i++) {
> +		if (drmcgrp->dev_resources[i] != NULL)
> +			drmcgrp_vendors[known_drmcgrp_devs[i]->vid]->free_dev_resource(drmcgrp->dev_resources[i]);
> +	}
>   
>   	kfree(css_drmcgrp(css));
>   }
> @@ -21,11 +49,27 @@ static struct cgroup_subsys_state *
>   drmcgrp_css_alloc(struct cgroup_subsys_state *parent_css)
>   {
>   	struct drmcgrp *drmcgrp;
> +	int i;
>   
>   	drmcgrp = kzalloc(sizeof(struct drmcgrp), GFP_KERNEL);
>   	if (!drmcgrp)
>   		return ERR_PTR(-ENOMEM);
>   
> +	for (i = 0; i <= max_minor; i++) {
> +		if (known_drmcgrp_devs[i] != NULL) {
> +			struct drmcgrp_device_resource *ddr =
> +				drmcgrp_vendors[known_drmcgrp_devs[i]->vid]->alloc_dev_resource();
> +
> +			if (IS_ERR(ddr)) {
> +				drmcgrp_css_free(&drmcgrp->css);
> +				return ERR_PTR(-ENOMEM);
> +			}
> +
> +			drmcgrp->dev_resources[i] = ddr;
> +			drmcgrp->dev_resources[i]->ddev = known_drmcgrp_devs[i];
> +		}
> +	}
> +
>   	return &drmcgrp->css;
>   }
>   
> @@ -44,3 +88,43 @@ struct cgroup_subsys drm_cgrp_subsys = {
>   	.legacy_cftypes	= files,
>   	.dfl_cftypes	= files,
>   };
> +
> +int drmcgrp_register_vendor(struct drmcgrp_vendor *vendor, enum drmcgrp_vendor_id id)
> +{
> +	int rc = 0;
> +	struct cftype *cfts;
> +
> +	// TODO: root css created before any registration
> +	if (drmcgrp_vendors[id] == NULL) {
> +		drmcgrp_vendors[id] = vendor;
> +		cfts = vendor->get_cftypes();
> +		if (cfts != NULL)
> +			rc = cgroup_add_legacy_cftypes(&drm_cgrp_subsys, cfts);
> +	}
> +	return rc;
> +}
> +EXPORT_SYMBOL(drmcgrp_register_vendor);
> +
> +
> +int drmcgrp_register_device(struct drm_device *dev, enum drmcgrp_vendor_id id)
> +{
> +	struct drmcgrp_device *ddev;
> +
> +	ddev = kzalloc(sizeof(struct drmcgrp_device), GFP_KERNEL);
> +	if (!ddev)
> +		return -ENOMEM;
> +
> +	mutex_lock(&drmcgrp_mutex);
> +
> +	ddev->vid = id;
> +	ddev->dev = dev;
> +	mutex_init(&ddev->mutex);
> +
> +	known_drmcgrp_devs[dev->primary->index] = ddev;
> +
> +	max_minor = max(max_minor, dev->primary->index);
> +
> +	mutex_unlock(&drmcgrp_mutex);
> +	return 0;
> +}
> +EXPORT_SYMBOL(drmcgrp_register_device);

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH RFC 3/5] drm/amdgpu: Add DRM cgroup support for AMD devices
       [not found]     ` <20181120185814.13362-4-Kenny.Ho-5C7GfCeVMHo@public.gmane.org>
@ 2018-11-21  9:55       ` Christian König
  0 siblings, 0 replies; 80+ messages in thread
From: Christian König @ 2018-11-21  9:55 UTC (permalink / raw)
  To: Kenny Ho, y2kenny-Re5JQEeQqe8AvxtiuMwx3w,
	cgroups-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Am 20.11.18 um 19:58 schrieb Kenny Ho:
> Change-Id: Ib66c44ac1b1c367659e362a2fc05b6fbb3805876
> Signed-off-by: Kenny Ho <Kenny.Ho@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/Makefile         |  3 ++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  |  7 ++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c | 37 +++++++++++++++++++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h | 19 +++++++++++
>   include/drm/drmcgrp_vendors.h               |  1 +
>   5 files changed, 67 insertions(+)
>   create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c
>   create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile b/drivers/gpu/drm/amd/amdgpu/Makefile
> index 138cb787d27e..5cf8048f2d75 100644
> --- a/drivers/gpu/drm/amd/amdgpu/Makefile
> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile
> @@ -186,4 +186,7 @@ amdgpu-y += $(AMD_DISPLAY_FILES)
>   
>   endif
>   
> +#DRM cgroup controller
> +amdgpu-y += amdgpu_drmcgrp.o
> +
>   obj-$(CONFIG_DRM_AMDGPU)+= amdgpu.o
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 30bc345d6fdf..ad0373f83ed3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -33,6 +33,7 @@
>   #include <drm/drm_crtc_helper.h>
>   #include <drm/drm_atomic_helper.h>
>   #include <drm/amdgpu_drm.h>
> +#include <drm/drm_cgroup.h>
>   #include <linux/vgaarb.h>
>   #include <linux/vga_switcheroo.h>
>   #include <linux/efi.h>
> @@ -2645,6 +2646,12 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>   		goto failed;
>   	}
>   
> +	/* TODO:docs */
> +	if (drmcgrp_vendors[amd_drmcgrp_vendor_id] == NULL)
> +		drmcgrp_register_vendor(&amd_drmcgrp_vendor, amd_drmcgrp_vendor_id);
> +
> +	drmcgrp_register_device(adev->ddev, amd_drmcgrp_vendor_id);
> +

Well that is most likely racy because it is possible that multiple 
instances of the driver initialize at the same time.

Better put the call to drmcgrp_register_vendor() into the module init 
section.
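
For illustration, a minimal sketch of what that could look like. This is
a sketch only: amdgpu_init() and amdgpu_kms_pci_driver follow the usual
amdgpu_drv.c naming but should be treated as assumptions here, and the
per-device drmcgrp_register_device() call would stay in
amdgpu_device_init().

static int __init amdgpu_init(void)
{
        int r;

        /* register the cgroup vendor exactly once, before any device can
         * probe, so concurrent amdgpu_device_init() calls cannot race */
        r = drmcgrp_register_vendor(&amd_drmcgrp_vendor, amd_drmcgrp_vendor_id);
        if (r)
                return r;

        return pci_register_driver(&amdgpu_kms_pci_driver);
}
module_init(amdgpu_init);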

Christian.

>   	return 0;
>   
>   failed:
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c
> new file mode 100644
> index 000000000000..ed8aac17769c
> --- /dev/null
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c
> @@ -0,0 +1,37 @@
> +// SPDX-License-Identifier: MIT
> +// Copyright 2018 Advanced Micro Devices, Inc.
> +#include <linux/slab.h>
> +#include <linux/cgroup_drm.h>
> +#include <drm/drm_device.h>
> +#include "amdgpu_drmcgrp.h"
> +
> +struct cftype files[] = {
> +	{ } /* terminate */
> +};
> +
> +struct cftype *drmcgrp_amd_get_cftypes(void)
> +{
> +	return files;
> +}
> +
> +struct drmcgrp_device_resource *amd_drmcgrp_alloc_dev_resource(void)
> +{
> +	struct amd_drmcgrp_dev_resource *a_ddr;
> +
> +	a_ddr = kzalloc(sizeof(struct amd_drmcgrp_dev_resource), GFP_KERNEL);
> +	if (!a_ddr)
> +		return ERR_PTR(-ENOMEM);
> +
> +	return &a_ddr->ddr;
> +}
> +
> +void amd_drmcgrp_free_dev_resource(struct drmcgrp_device_resource *ddr)
> +{
> +	kfree(ddr_amdddr(ddr));
> +}
> +
> +struct drmcgrp_vendor amd_drmcgrp_vendor = {
> +	.get_cftypes = drmcgrp_amd_get_cftypes,
> +	.alloc_dev_resource = amd_drmcgrp_alloc_dev_resource,
> +	.free_dev_resource = amd_drmcgrp_free_dev_resource,
> +};
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h
> new file mode 100644
> index 000000000000..e2934b7a49f5
> --- /dev/null
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h
> @@ -0,0 +1,19 @@
> +/* SPDX-License-Identifier: MIT
> + * Copyright 2018 Advanced Micro Devices, Inc.
> + */
> +#ifndef _AMDGPU_DRMCGRP_H
> +#define _AMDGPU_DRMCGRP_H
> +
> +#include <linux/cgroup_drm.h>
> +
> +/* for AMD specific DRM resources */
> +struct amd_drmcgrp_dev_resource {
> +	struct drmcgrp_device_resource ddr;
> +};
> +
> +static inline struct amd_drmcgrp_dev_resource *ddr_amdddr(struct drmcgrp_device_resource *ddr)
> +{
> +	return ddr ? container_of(ddr, struct amd_drmcgrp_dev_resource, ddr) : NULL;
> +}
> +
> +#endif	/* _AMDGPU_DRMCGRP_H */
> diff --git a/include/drm/drmcgrp_vendors.h b/include/drm/drmcgrp_vendors.h
> index b04d8649851b..6cfbf1825344 100644
> --- a/include/drm/drmcgrp_vendors.h
> +++ b/include/drm/drmcgrp_vendors.h
> @@ -3,5 +3,6 @@
>    */
>   #if IS_ENABLED(CONFIG_CGROUP_DRM)
>   
> +DRMCGRP_VENDOR(amd)
>   
>   #endif

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH RFC 4/5] drm/amdgpu: Add accounting of command submission via DRM cgroup
  2018-11-20 18:58   ` [PATCH RFC 4/5] drm/amdgpu: Add accounting of command submission via DRM cgroup Kenny Ho
  2018-11-20 20:57       ` Eric Anholt
@ 2018-11-21  9:58     ` Christian König
  1 sibling, 0 replies; 80+ messages in thread
From: Christian König @ 2018-11-21  9:58 UTC (permalink / raw)
  To: Kenny Ho, y2kenny, cgroups, dri-devel, amd-gfx, intel-gfx

Am 20.11.18 um 19:58 schrieb Kenny Ho:
> Account for the number of command submitted to amdgpu by type on a per
> cgroup basis, for the purpose of profiling/monitoring applications.
>
> x prefix in the control file name x.cmd_submitted.amd.stat signify
> experimental.
>
> Change-Id: Ibc22e5bda600f54fe820fe0af5400ca348691550
> Signed-off-by: Kenny Ho <Kenny.Ho@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c      |  5 ++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c | 54 +++++++++++++++++++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h |  5 ++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c    | 15 ++++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h    |  5 +-
>   5 files changed, 83 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index 663043c8f0f5..b448160aed89 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -33,6 +33,7 @@
>   #include "amdgpu_trace.h"
>   #include "amdgpu_gmc.h"
>   #include "amdgpu_gem.h"
> +#include "amdgpu_drmcgrp.h"
>   
>   static int amdgpu_cs_user_fence_chunk(struct amdgpu_cs_parser *p,
>   				      struct drm_amdgpu_cs_chunk_fence *data,
> @@ -1275,6 +1276,7 @@ int amdgpu_cs_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
>   	union drm_amdgpu_cs *cs = data;
>   	struct amdgpu_cs_parser parser = {};
>   	bool reserved_buffers = false;
> +	struct amdgpu_ring *ring;
>   	int i, r;
>   
>   	if (!adev->accel_working)
> @@ -1317,6 +1319,9 @@ int amdgpu_cs_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
>   	if (r)
>   		goto out;
>   
> +	ring = to_amdgpu_ring(parser.entity->rq->sched);
> +	amdgpu_drmcgrp_count_cs(current, dev, ring->funcs->type);
> +
>   	r = amdgpu_cs_submit(&parser, cs);
>   
>   out:
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c
> index ed8aac17769c..853b77532428 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c
> @@ -1,11 +1,65 @@
>   // SPDX-License-Identifier: MIT
>   // Copyright 2018 Advanced Micro Devices, Inc.
>   #include <linux/slab.h>
> +#include <linux/mutex.h>
>   #include <linux/cgroup_drm.h>
>   #include <drm/drm_device.h>
> +#include "amdgpu_ring.h"
>   #include "amdgpu_drmcgrp.h"
>   
> +void amdgpu_drmcgrp_count_cs(struct task_struct *task, struct drm_device *dev,
> +		enum amdgpu_ring_type r_type)
> +{
> +	struct drmcgrp *drmcgrp = get_drmcgrp(task);
> +	struct drmcgrp_device_resource *ddr;
> +	struct drmcgrp *p;
> +	struct amd_drmcgrp_dev_resource *a_ddr;
> +
> +	if (drmcgrp == NULL)
> +		return;
> +
> +	ddr = drmcgrp->dev_resources[dev->primary->index];
> +
> +	mutex_lock(&ddr->ddev->mutex);
> +	for (p = drmcgrp; p != NULL; p = parent_drmcgrp(drmcgrp)) {
> +		a_ddr = ddr_amdddr(p->dev_resources[dev->primary->index]);
> +
> +		a_ddr->cs_count[r_type]++;
> +	}
> +	mutex_unlock(&ddr->ddev->mutex);
> +}
> +
> +int amd_drmcgrp_cmd_submit_accounting_read(struct seq_file *sf, void *v)
> +{
> +	struct drmcgrp *drmcgrp = css_drmcgrp(seq_css(sf));
> +	struct drmcgrp_device_resource *ddr = NULL;
> +	struct amd_drmcgrp_dev_resource *a_ddr = NULL;
> +	int i, j;
> +
> +	seq_puts(sf, "---\n");
> +	for (i = 0; i < MAX_DRM_DEV; i++) {
> +		ddr = drmcgrp->dev_resources[i];
> +
> +		if (ddr == NULL || ddr->ddev->vid != amd_drmcgrp_vendor_id)
> +			continue;
> +
> +		a_ddr = ddr_amdddr(ddr);
> +
> +		seq_printf(sf, "card%d:\n", i);
> +		for (j = 0; j < __MAX_AMDGPU_RING_TYPE; j++)
> +			seq_printf(sf, "  %s: %llu\n", amdgpu_ring_names[j], a_ddr->cs_count[j]);
> +	}
> +
> +	return 0;
> +}
> +
> +
>   struct cftype files[] = {
> +	{
> +		.name = "x.cmd_submitted.amd.stat",
> +		.seq_show = amd_drmcgrp_cmd_submit_accounting_read,
> +		.flags = CFTYPE_NOT_ON_ROOT,
> +	},
>   	{ } /* terminate */
>   };
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h
> index e2934b7a49f5..f894a9a1059f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h
> @@ -5,12 +5,17 @@
>   #define _AMDGPU_DRMCGRP_H
>   
>   #include <linux/cgroup_drm.h>
> +#include "amdgpu_ring.h"
>   
>   /* for AMD specific DRM resources */
>   struct amd_drmcgrp_dev_resource {
>   	struct drmcgrp_device_resource ddr;
> +	u64 cs_count[__MAX_AMDGPU_RING_TYPE];
>   };
>   
> +void amdgpu_drmcgrp_count_cs(struct task_struct *task, struct drm_device *dev,
> +		enum amdgpu_ring_type r_type);
> +
>   static inline struct amd_drmcgrp_dev_resource *ddr_amdddr(struct drmcgrp_device_resource *ddr)
>   {
>   	return ddr ? container_of(ddr, struct amd_drmcgrp_dev_resource, ddr) : NULL;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> index b70e85ec147d..1606f84d2334 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> @@ -34,6 +34,21 @@
>   #include "amdgpu.h"
>   #include "atom.h"
>   
> +char const *amdgpu_ring_names[] = {
> +	[AMDGPU_RING_TYPE_GFX]		= "gfx",
> +	[AMDGPU_RING_TYPE_COMPUTE]	= "compute",
> +	[AMDGPU_RING_TYPE_SDMA]		= "sdma",
> +	[AMDGPU_RING_TYPE_UVD]		= "uvd",
> +	[AMDGPU_RING_TYPE_VCE]		= "vce",
> +	[AMDGPU_RING_TYPE_KIQ]		= "kiq",
> +	[AMDGPU_RING_TYPE_UVD_ENC]	= "uvd_enc",
> +	[AMDGPU_RING_TYPE_VCN_DEC]	= "vcn_dec",
> +	[AMDGPU_RING_TYPE_VCN_ENC]	= "vcn_enc",
> +	[AMDGPU_RING_TYPE_VCN_JPEG]	= "vcn_jpeg",
> +	[__MAX_AMDGPU_RING_TYPE]	= "_max"
> +
> +};
> +

Each ring already has a dedicated name, so that looks like a duplication 
to me.

It could be that we need this for the ring type name, but then you need
to rename it accordingly.

And please don't use something like "__MAX_AMDGPU....", just name that 
AMDGPU_RING_TYPE_LAST.
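
As a sketch of that suggestion (only the sentinel and the array's name
change; the per-ring instance names already stored in each ring stay as
they are, and the exact identifiers below are assumptions):

enum amdgpu_ring_type {
        /* ... existing entries unchanged ... */
        AMDGPU_RING_TYPE_VCN_JPEG,
        AMDGPU_RING_TYPE_LAST   /* sentinel, only used for array sizing */
};

/* ring *type* names for the cgroup stats, not per-ring instance names */
static const char * const amdgpu_ring_type_names[AMDGPU_RING_TYPE_LAST] = {
        [AMDGPU_RING_TYPE_GFX]          = "gfx",
        /* ... remaining types ... */
};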

Christian.

>   /*
>    * Rings
>    * Most engines on the GPU are fed via ring buffers.  Ring
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> index 4caa301ce454..e292b7e89c6a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> @@ -56,9 +56,12 @@ enum amdgpu_ring_type {
>   	AMDGPU_RING_TYPE_UVD_ENC,
>   	AMDGPU_RING_TYPE_VCN_DEC,
>   	AMDGPU_RING_TYPE_VCN_ENC,
> -	AMDGPU_RING_TYPE_VCN_JPEG
> +	AMDGPU_RING_TYPE_VCN_JPEG,
> +	__MAX_AMDGPU_RING_TYPE
>   };
>   
> +extern char const *amdgpu_ring_names[];
> +
>   struct amdgpu_device;
>   struct amdgpu_ring;
>   struct amdgpu_ib;

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH RFC 5/5] drm/amdgpu: Add accounting of buffer object creation request via DRM cgroup
       [not found]   ` <20181120185814.13362-6-Kenny.Ho-5C7GfCeVMHo@public.gmane.org>
  2018-11-20 20:56       ` Eric Anholt
@ 2018-11-21 10:00     ` Christian König
  2018-11-27 18:15       ` Kenny Ho
  1 sibling, 1 reply; 80+ messages in thread
From: Christian König @ 2018-11-21 10:00 UTC (permalink / raw)
  To: Kenny Ho, y2kenny-Re5JQEeQqe8AvxtiuMwx3w,
	cgroups-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Am 20.11.18 um 19:58 schrieb Kenny Ho:
> Account for the total size of buffer object requested to amdgpu by
> buffer type on a per cgroup basis.
>
> x prefix in the control file name x.bo_requested.amd.stat signify
> experimental.
>
> Change-Id: Ifb680c4bcf3652879a7a659510e25680c2465cf6
> Signed-off-by: Kenny Ho <Kenny.Ho@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c | 56 +++++++++++++++++++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h |  3 ++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c     | 13 +++++
>   include/uapi/drm/amdgpu_drm.h               | 24 ++++++---
>   4 files changed, 90 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c
> index 853b77532428..e3d98ed01b79 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c
> @@ -7,6 +7,57 @@
>   #include "amdgpu_ring.h"
>   #include "amdgpu_drmcgrp.h"
>   
> +void amdgpu_drmcgrp_count_bo_req(struct task_struct *task, struct drm_device *dev,
> +		u32 domain, unsigned long size)
> +{
> +	struct drmcgrp *drmcgrp = get_drmcgrp(task);
> +	struct drmcgrp_device_resource *ddr;
> +	struct drmcgrp *p;
> +	struct amd_drmcgrp_dev_resource *a_ddr;
> +        int i;
> +
> +	if (drmcgrp == NULL)
> +		return;
> +
> +	ddr = drmcgrp->dev_resources[dev->primary->index];
> +
> +	mutex_lock(&ddr->ddev->mutex);
> +	for (p = drmcgrp; p != NULL; p = parent_drmcgrp(drmcgrp)) {
> +		a_ddr = ddr_amdddr(p->dev_resources[dev->primary->index]);
> +
> +		for (i = 0; i < __MAX_AMDGPU_MEM_DOMAIN; i++)
> +			if ( (1 << i) & domain)
> +				a_ddr->bo_req_count[i] += size;
> +	}
> +	mutex_unlock(&ddr->ddev->mutex);
> +}
> +
> +int amd_drmcgrp_bo_req_stat_read(struct seq_file *sf, void *v)
> +{
> +	struct drmcgrp *drmcgrp = css_drmcgrp(seq_css(sf));
> +	struct drmcgrp_device_resource *ddr = NULL;
> +	struct amd_drmcgrp_dev_resource *a_ddr = NULL;
> +	int i, j;
> +
> +	seq_puts(sf, "---\n");
> +	for (i = 0; i < MAX_DRM_DEV; i++) {
> +		ddr = drmcgrp->dev_resources[i];
> +
> +		if (ddr == NULL || ddr->ddev->vid != amd_drmcgrp_vendor_id)
> +			continue;
> +
> +		a_ddr = ddr_amdddr(ddr);
> +
> +		seq_printf(sf, "card%d:\n", i);
> +		for (j = 0; j < __MAX_AMDGPU_MEM_DOMAIN; j++)
> +			seq_printf(sf, "  %s: %llu\n", amdgpu_mem_domain_names[j], a_ddr->bo_req_count[j]);
> +	}
> +
> +	return 0;
> +}
> +
> +
> +
>   void amdgpu_drmcgrp_count_cs(struct task_struct *task, struct drm_device *dev,
>   		enum amdgpu_ring_type r_type)
>   {
> @@ -55,6 +106,11 @@ int amd_drmcgrp_cmd_submit_accounting_read(struct seq_file *sf, void *v)
>   
>   
>   struct cftype files[] = {
> +	{
> +		.name = "x.bo_requested.amd.stat",
> +		.seq_show = amd_drmcgrp_bo_req_stat_read,
> +		.flags = CFTYPE_NOT_ON_ROOT,
> +	},
>   	{
>   		.name = "x.cmd_submitted.amd.stat",
>   		.seq_show = amd_drmcgrp_cmd_submit_accounting_read,
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h
> index f894a9a1059f..8b9d61e47dde 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h
> @@ -11,10 +11,13 @@
>   struct amd_drmcgrp_dev_resource {
>   	struct drmcgrp_device_resource ddr;
>   	u64 cs_count[__MAX_AMDGPU_RING_TYPE];
> +	u64 bo_req_count[__MAX_AMDGPU_MEM_DOMAIN];
>   };
>   
>   void amdgpu_drmcgrp_count_cs(struct task_struct *task, struct drm_device *dev,
>   		enum amdgpu_ring_type r_type);
> +void amdgpu_drmcgrp_count_bo_req(struct task_struct *task, struct drm_device *dev,
> +		u32 domain, unsigned long size);
>   
>   static inline struct amd_drmcgrp_dev_resource *ddr_amdddr(struct drmcgrp_device_resource *ddr)
>   {
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> index 7b3d1ebda9df..339e1d3edad8 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> @@ -31,6 +31,17 @@
>   #include <drm/amdgpu_drm.h>
>   #include "amdgpu.h"
>   #include "amdgpu_display.h"
> +#include "amdgpu_drmcgrp.h"
> +
> +char const *amdgpu_mem_domain_names[] = {
> +	[AMDGPU_MEM_DOMAIN_CPU]		= "cpu",
> +	[AMDGPU_MEM_DOMAIN_GTT]		= "gtt",
> +	[AMDGPU_MEM_DOMAIN_VRAM]	= "vram",
> +	[AMDGPU_MEM_DOMAIN_GDS]		= "gds",
> +	[AMDGPU_MEM_DOMAIN_GWS]		= "gws",
> +	[AMDGPU_MEM_DOMAIN_OA]		= "oa",
> +	[__MAX_AMDGPU_MEM_DOMAIN]	= "_max"
> +};
>   
>   void amdgpu_gem_object_free(struct drm_gem_object *gobj)
>   {
> @@ -52,6 +63,8 @@ int amdgpu_gem_object_create(struct amdgpu_device *adev, unsigned long size,
>   	struct amdgpu_bo_param bp;
>   	int r;
>   
> +	amdgpu_drmcgrp_count_bo_req(current, adev->ddev, initial_domain, size);
> +
>   	memset(&bp, 0, sizeof(bp));
>   	*obj = NULL;
>   	/* At least align on page size */
> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
> index 370e9a5536ef..531726443104 100644
> --- a/include/uapi/drm/amdgpu_drm.h
> +++ b/include/uapi/drm/amdgpu_drm.h
> @@ -72,6 +72,18 @@ extern "C" {
>   #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union drm_amdgpu_fence_to_handle)
>   #define DRM_IOCTL_AMDGPU_SCHED		DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
>   
> +enum amdgpu_mem_domain {
> +	AMDGPU_MEM_DOMAIN_CPU,
> +	AMDGPU_MEM_DOMAIN_GTT,
> +	AMDGPU_MEM_DOMAIN_VRAM,
> +	AMDGPU_MEM_DOMAIN_GDS,
> +	AMDGPU_MEM_DOMAIN_GWS,
> +	AMDGPU_MEM_DOMAIN_OA,
> +	__MAX_AMDGPU_MEM_DOMAIN
> +};

Well that is a clear NAK since it duplicates the TTM defines. Please use
those instead and don't make this UAPI.
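
To illustrate the alternative, here is a sketch that indexes the
accounting by the existing TTM placements instead of introducing a new
UAPI enum. It assumes TTM_PL_SYSTEM/TT/VRAM from ttm_placement.h and the
AMDGPU_PL_GDS/GWS/OA private placements from amdgpu_ttm.h; the helper
name is made up for the example.

/* sketch: map a single GEM domain bit to the TTM placement backing it,
 * so bo_req_count[] can be indexed by placement rather than a new enum */
static unsigned int amdgpu_gem_domain_to_pl(u32 domain_bit)
{
        switch (domain_bit) {
        case AMDGPU_GEM_DOMAIN_CPU:     return TTM_PL_SYSTEM;
        case AMDGPU_GEM_DOMAIN_GTT:     return TTM_PL_TT;
        case AMDGPU_GEM_DOMAIN_VRAM:    return TTM_PL_VRAM;
        case AMDGPU_GEM_DOMAIN_GDS:     return AMDGPU_PL_GDS;
        case AMDGPU_GEM_DOMAIN_GWS:     return AMDGPU_PL_GWS;
        case AMDGPU_GEM_DOMAIN_OA:      return AMDGPU_PL_OA;
        default:                        return TTM_PL_SYSTEM;
        }
}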

Christian.

> +
> +extern char const *amdgpu_mem_domain_names[];
> +
>   /**
>    * DOC: memory domains
>    *
> @@ -95,12 +107,12 @@ extern "C" {
>    * %AMDGPU_GEM_DOMAIN_OA	Ordered append, used by 3D or Compute engines
>    * for appending data.
>    */
> -#define AMDGPU_GEM_DOMAIN_CPU		0x1
> -#define AMDGPU_GEM_DOMAIN_GTT		0x2
> -#define AMDGPU_GEM_DOMAIN_VRAM		0x4
> -#define AMDGPU_GEM_DOMAIN_GDS		0x8
> -#define AMDGPU_GEM_DOMAIN_GWS		0x10
> -#define AMDGPU_GEM_DOMAIN_OA		0x20
> +#define AMDGPU_GEM_DOMAIN_CPU		(1 << AMDGPU_MEM_DOMAIN_CPU)
> +#define AMDGPU_GEM_DOMAIN_GTT		(1 << AMDGPU_MEM_DOMAIN_GTT)
> +#define AMDGPU_GEM_DOMAIN_VRAM		(1 << AMDGPU_MEM_DOMAIN_VRAM)
> +#define AMDGPU_GEM_DOMAIN_GDS		(1 << AMDGPU_MEM_DOMAIN_GDS)
> +#define AMDGPU_GEM_DOMAIN_GWS		(1 << AMDGPU_MEM_DOMAIN_GWS)
> +#define AMDGPU_GEM_DOMAIN_OA		(1 << AMDGPU_MEM_DOMAIN_OA)
>   #define AMDGPU_GEM_DOMAIN_MASK		(AMDGPU_GEM_DOMAIN_CPU | \
>   					 AMDGPU_GEM_DOMAIN_GTT | \
>   					 AMDGPU_GEM_DOMAIN_VRAM | \

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH RFC 4/5] drm/amdgpu: Add accounting of command submission via DRM cgroup
       [not found]       ` <87r2ff79he.fsf-WhKQ6XTQaPysTnJN9+BGXg@public.gmane.org>
@ 2018-11-21 10:03         ` Christian König
  2018-11-23 17:36           ` Eric Anholt
  0 siblings, 1 reply; 80+ messages in thread
From: Christian König @ 2018-11-21 10:03 UTC (permalink / raw)
  To: Eric Anholt, Kenny Ho, y2kenny-Re5JQEeQqe8AvxtiuMwx3w,
	cgroups-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW


[-- Attachment #1.1: Type: text/plain, Size: 769 bytes --]

Am 20.11.18 um 21:57 schrieb Eric Anholt:
> Kenny Ho <Kenny.Ho-5C7GfCeVMHo@public.gmane.org> writes:
>
>> Account for the number of command submitted to amdgpu by type on a per
>> cgroup basis, for the purpose of profiling/monitoring applications.
> For profiling other drivers, I've used perf tracepoints, which let you
> get useful timelines of multiple events in the driver.  Have you made
> use of this stat for productive profiling?

Yes, but this is not related to profiling at all.

What we want to do is to limit the resource usage of processes.

Regards,
Christian.

>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[-- Attachment #1.2: Type: text/html, Size: 1867 bytes --]

[-- Attachment #2: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices
       [not found]                 ` <20181120223018.GB2509588-LpCCV3molIbIZ9tKgghJQw2O0Ztt9esIQQ4Iyu8u01E@public.gmane.org>
@ 2018-11-21 22:07                   ` Ho, Kenny
  2018-11-21 22:12                   ` Ho, Kenny
  2018-11-26 20:59                   ` Kasiviswanathan, Harish
  2 siblings, 0 replies; 80+ messages in thread
From: Ho, Kenny @ 2018-11-21 22:07 UTC (permalink / raw)
  To: Tejun Heo
  Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, y2kenny-Re5JQEeQqe8AvxtiuMwx3w,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW


[-- Attachment #1.1: Type: text/plain, Size: 4390 bytes --]

Hi Tejun,

On Tue, Nov 20, 2018 at 5:30 PM Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> On Tue, Nov 20, 2018 at 10:21:14PM +0000, Ho, Kenny wrote:
> > By this reply, are you suggesting that vendor specific resources
> > will never be acceptable to be managed under cgroup?  Let say a user
>
> I wouldn't say never but whatever which gets included as a cgroup
> controller should have clearly defined resource abstractions and the
> control schemes around them including support for delegation.  AFAICS,
> gpu side still seems to have a long way to go (and it's not clear
> whether that's somewhere it will or needs to end up).
Right, I totally understand that it's not obvious from this RFC because the 'resource' counting demonstrated in this RFC is trivial in nature, mostly to illustrate the 'vendor' concept.  The structure of this patch actually gives us the ability to support both the abstracted resources you mentioned and vendor specific resources.  But it is probably not very clear as the RFC only includes two resources and they are both vendor specific.  To be clear, I am not saying there aren't abstracted resources in drm, there are (we are still working on those).  What I am saying is that not all resources are abstracted and for the purpose of this RFC I was hoping to get some feedback on the vendor specific parts early just so that we don't go down the wrong path.

That said, I think I am getting a better sense of what you are saying.  Please correct me if I misinterpreted: your concern is that abstracting by vendor is too high level and it's too much of a free-for-all.  Instead, resources should be abstracted at the controller level even if it's only available to a specific vendor (or even a specific product from a specific vendor).  Is that a fair read?

A couple of additional side questions:
* Are statistic/accounting-only use cases like those enabled by the cpuacct controller no longer sufficient?  If they are still sufficient, can you elaborate more on what you mean by having control schemes and supporting delegation?
* When you wrote delegation, do you mean delegation in the sense described in https://www.kernel.org/doc/Documentation/cgroup-v2.txt ?

> > To put the questions in more concrete terms, let say a user wants to
> > expose certain part of a gpu to a particular cgroup similar to the
> > way selective cpu cores are exposed to a cgroup via cpuset, how
> > should we go about enabling such functionality?
>
> Do what the intel driver or bpf is doing?  It's not difficult to hook
> into cgroup for identification purposes.
Does intel driver or bpf present an interface file in cgroupfs for users to configure the core selection like cpuset?  I must admit I am not too familiar with the bpf case as I was referencing mostly the way rdma was implemented when putting this RFC together.


Perhaps I wasn't communicating clearly so let me see if I can illustrate this discussion with a hypothetical but concrete example using our competitor's product.  Nvidia has something called Tensor Cores in some of their GPUs and the purpose of those cores is to accelerate matrix operations for machine learning applications.  This is something unique to Nvidia and to my knowledge no one else has something like it.  These cores are different from regular shader processors and there are multiple of them in a GPU.

Under the structure of this RFC, if Nvidia wants to make Tensor Cores manageable via cgroup (with the "Allocation" distribution model let say), they will probably have an interface file called "drm.nvidia.tensor_core", in which only nvidia's GPUs will be listed.  If a GPU has TC, it will have a positive count, otherwise 0.

If I understand you correctly, Tejun, what you are saying is that they should not do that.  What they should do is have an abstracted resource, possibly named "drm.matrix_accelerator", where all drm devices available on a system will be listed.  All GPUs except some of Nvidia's will have a count of 0.  Or perhaps that is not sufficiently abstracted, so instead there should be just "drm.cores", and that file lists devices, core types and counts.  For one vendor the types may be shader processors, texture map units, tensor cores and ray tracing cores.  Others may have ALUs, EUs and subslices.

Is that an accurate representation of what you are recommending?

Regards,
Kenny


[-- Attachment #1.2: Type: text/html, Size: 5553 bytes --]

[-- Attachment #2: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices
       [not found]                 ` <20181120223018.GB2509588-LpCCV3molIbIZ9tKgghJQw2O0Ztt9esIQQ4Iyu8u01E@public.gmane.org>
  2018-11-21 22:07                   ` Ho, Kenny
@ 2018-11-21 22:12                   ` Ho, Kenny
  2018-11-26 20:59                   ` Kasiviswanathan, Harish
  2 siblings, 0 replies; 80+ messages in thread
From: Ho, Kenny @ 2018-11-21 22:12 UTC (permalink / raw)
  To: Tejun Heo
  Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, y2kenny-Re5JQEeQqe8AvxtiuMwx3w,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

(resending because previous email switched to HTML mode and was filtered out)

Hi Tejun,

On Tue, Nov 20, 2018 at 5:30 PM Tejun Heo <tj@kernel.org> wrote:
> On Tue, Nov 20, 2018 at 10:21:14PM +0000, Ho, Kenny wrote:
> > By this reply, are you suggesting that vendor specific resources
> > will never be acceptable to be managed under cgroup?  Let say a user
>
> I wouldn't say never but whatever which gets included as a cgroup
> controller should have clearly defined resource abstractions and the
> control schemes around them including support for delegation.  AFAICS,
> gpu side still seems to have a long way to go (and it's not clear
> whether that's somewhere it will or needs to end up).
Right, I totally understand that it's not obvious from this RFC because the 'resource' counting demonstrated in this RFC is trivial in nature, mostly to illustrate the 'vendor' concept.  The structure of this patch actually gives us the ability to support both the abstracted resources you mentioned and vendor specific resources.  It is probably not obvious as the RFC only includes two resources and they are both vendor specific.  To be clear, I am not saying there aren't abstracted resources in drm, there are (we are still working on those).  What I am saying is that not all resources are abstracted and for the purpose of this RFC I was hoping to get some feedback on the vendor specific parts early just so that we don't go down the wrong path.

That said, I think I am getting a better sense of what you are saying.  Please correct me if I misinterpreted: your concern is that abstracting by vendor is too high level and it's too much of a free-for-all.  Instead, resources should be abstracted at the controller level even if it's only available to a specific vendor (or even a specific product from a specific vendor).  Is that a fair read?

A couple of additional side questions:
* Are statistic/accounting-only use cases like those enabled by the cpuacct controller no longer sufficient?  If they are still sufficient, can you elaborate more on what you mean by having control schemes and supporting delegation?
* When you wrote delegation, do you mean delegation in the sense described in https://www.kernel.org/doc/Documentation/cgroup-v2.txt ?

> > To put the questions in more concrete terms, let say a user wants to
> > expose certain part of a gpu to a particular cgroup similar to the
> > way selective cpu cores are exposed to a cgroup via cpuset, how
> > should we go about enabling such functionality?
>
> Do what the intel driver or bpf is doing?  It's not difficult to hook
> into cgroup for identification purposes.
Does intel driver or bpf present an interface file in cgroupfs for users to configure the core selection like cpuset?  I must admit I am not too familiar with the bpf case as I was referencing mostly the way rdma was implemented when putting this RFC together.


Perhaps I wasn't communicating clearly so let me see if I can illustrate this discussion with a hypothetical but concrete example using our competitor's product.  Nvidia has something called Tensor Cores in some of their GPUs and the purpose of those cores is to accelerate matrix operations for machine learning applications.  This is something unique to Nvidia and to my knowledge no one else has something like it.  These cores are different from regular shader processors and there are multiple of them in a GPU.

Under the structure of this RFC, if Nvidia wants to make Tensor Cores manageable via cgroup (with the "Allocation" distribution model let say), they will probably have an interface file called "drm.nvidia.tensor_core", in which only nvidia's GPUs will be listed.  If a GPU has TC, it will have a positive count, otherwise 0.

If I understand you correctly, Tejun, what you are saying is that they should not do that.  What they should do is have an abstracted resource, possibly named "drm.matrix_accelerator", where all drm devices available on a system will be listed.  All GPUs except some of Nvidia's will have a count of 0.  Or perhaps that is not sufficiently abstracted, so instead there should be just "drm.cores", and that file lists devices, core types and counts.  For one vendor the types may be shader processors, texture map units, tensor cores and ray tracing cores.  Others may have ALUs, EUs and subslices.
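
To make the contrast concrete, hypothetical contents of the two kinds of
interface files could look like this (every device name, core type and
count below is made up purely for illustration):

  # vendor-specific file: only that vendor's devices show up
  $ cat /sys/fs/cgroup/<group>/drm.nvidia.tensor_core
  card1: 320

  # abstracted file: every drm device is listed with per-type counts
  $ cat /sys/fs/cgroup/<group>/drm.cores
  card0: shader 2560 tmu 160 tensor 0
  card1: cuda 4608 tensor 576 rt 72
  card2: eu 72 subslice 9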

Is that an accurate representation of what you are recommending?

Regards,
Kenny 
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH RFC 4/5] drm/amdgpu: Add accounting of command submission via DRM cgroup
  2018-11-21 10:03         ` Christian König
@ 2018-11-23 17:36           ` Eric Anholt
       [not found]             ` <871s7b7l2b.fsf-WhKQ6XTQaPysTnJN9+BGXg@public.gmane.org>
  0 siblings, 1 reply; 80+ messages in thread
From: Eric Anholt @ 2018-11-23 17:36 UTC (permalink / raw)
  To: christian.koenig, Kenny Ho, y2kenny, cgroups, dri-devel, amd-gfx,
	intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 894 bytes --]

Christian König <ckoenig.leichtzumerken@gmail.com> writes:

> Am 20.11.18 um 21:57 schrieb Eric Anholt:
>> Kenny Ho <Kenny.Ho@amd.com> writes:
>>
>>> Account for the number of command submitted to amdgpu by type on a per
>>> cgroup basis, for the purpose of profiling/monitoring applications.
>> For profiling other drivers, I've used perf tracepoints, which let you
>> get useful timelines of multiple events in the driver.  Have you made
>> use of this stat for productive profiling?
>
> Yes, but this is not related to profiling at all.
>
> What we want to do is to limit the resource usage of processes.

That sounds great, and something I'd be interested in for vc4.  However,
as far as I saw explained here, this patch doesn't let you limit
resource usage of a process and is only useful for
"profiling/monitoring" so I'm wondering how it is useful for that
purpose.

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH RFC 4/5] drm/amdgpu: Add accounting of command submission via DRM cgroup
       [not found]             ` <871s7b7l2b.fsf-WhKQ6XTQaPysTnJN9+BGXg@public.gmane.org>
@ 2018-11-23 18:13                 ` Koenig, Christian
  0 siblings, 0 replies; 80+ messages in thread
From: Koenig, Christian @ 2018-11-23 18:13 UTC (permalink / raw)
  To: Eric Anholt, Ho, Kenny, y2kenny-Re5JQEeQqe8AvxtiuMwx3w,
	cgroups-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Am 23.11.18 um 18:36 schrieb Eric Anholt:
> Christian König <ckoenig.leichtzumerken@gmail.com> writes:
>
>> Am 20.11.18 um 21:57 schrieb Eric Anholt:
>>> Kenny Ho <Kenny.Ho@amd.com> writes:
>>>
>>>> Account for the number of command submitted to amdgpu by type on a per
>>>> cgroup basis, for the purpose of profiling/monitoring applications.
>>> For profiling other drivers, I've used perf tracepoints, which let you
>>> get useful timelines of multiple events in the driver.  Have you made
>>> use of this stat for productive profiling?
>> Yes, but this is not related to profiling at all.
>>
>> What we want to do is to limit the resource usage of processes.
> That sounds great, and something I'd be interested in for vc4.  However,
> as far as I saw explained here, this patch doesn't let you limit
> resource usage of a process and is only useful for
> "profiling/monitoring" so I'm wondering how it is useful for that
> purpose.

Ok, good to know. I haven't looked at this in depth, but if this is just 
for accounting that would certainly be missing the goal.

Christian.
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 80+ messages in thread

* RE: [PATCH RFC 4/5] drm/amdgpu: Add accounting of command submission via DRM cgroup
       [not found]                 ` <095e010c-e3b8-ec79-c87b-a05ce1d95e10-5C7GfCeVMHo@public.gmane.org>
@ 2018-11-23 19:09                     ` Ho, Kenny
  0 siblings, 0 replies; 80+ messages in thread
From: Ho, Kenny @ 2018-11-23 19:09 UTC (permalink / raw)
  To: Koenig, Christian, Eric Anholt, y2kenny-Re5JQEeQqe8AvxtiuMwx3w,
	cgroups-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On Fri, Nov 23, 2018 at 1:13 PM Koenig, Christian <Christian.Koenig@amd.com> wrote:
> Am 23.11.18 um 18:36 schrieb Eric Anholt:
> > Christian König <ckoenig.leichtzumerken@gmail.com> writes:
> >> Am 20.11.18 um 21:57 schrieb Eric Anholt:
> >>> Kenny Ho <Kenny.Ho@amd.com> writes:
> >>>> Account for the number of command submitted to amdgpu by type on a per
> >>>> cgroup basis, for the purpose of profiling/monitoring applications.
> >>> For profiling other drivers, I've used perf tracepoints, which let you
> >>> get useful timelines of multiple events in the driver.  Have you made
> >>> use of this stat for productive profiling?
> >> Yes, but this is not related to profiling at all.
> >>
> >> What we want to do is to limit the resource usage of processes.
> > That sounds great, and something I'd be interested in for vc4.  However,
> > as far as I saw explained here, this patch doesn't let you limit
> > resource usage of a process and is only useful for
> > "profiling/monitoring" so I'm wondering how it is useful for that
> > purpose.
>
> Ok, good to know. I haven't looked at this in deep, but if this is just
> for accounting that would certainly be missing the goal.
The end goal is to have limits in place.  The current patch is mostly to illustrate the structure of the controller and to get some early feedback.  I will have more soon.

Regards,
Kenny
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices
       [not found]                 ` <20181120223018.GB2509588-LpCCV3molIbIZ9tKgghJQw2O0Ztt9esIQQ4Iyu8u01E@public.gmane.org>
  2018-11-21 22:07                   ` Ho, Kenny
  2018-11-21 22:12                   ` Ho, Kenny
@ 2018-11-26 20:59                   ` Kasiviswanathan, Harish
  2018-11-27  9:38                     ` Koenig, Christian
  2018-11-27  9:46                     ` [Intel-gfx] " Joonas Lahtinen
  2 siblings, 2 replies; 80+ messages in thread
From: Kasiviswanathan, Harish @ 2018-11-26 20:59 UTC (permalink / raw)
  To: Tejun Heo, Ho, Kenny
  Cc: cgroups-u79uwXL29TY76Z2rM5mHXA,
	intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	y2kenny-Re5JQEeQqe8AvxtiuMwx3w,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Thanks Tejun, Eric and Christian for your replies.

We want GPU resource management to work seamlessly with containers and container orchestration. With the Intel / bpf based approach this is not possible. 

From your response we gather the following: GPU resources need to be abstracted. We will send a new proposal in the same vein. Our current thinking is to start with a single abstracted resource and build a framework that can be expanded to include additional resources. We plan to start with “GPU cores”. We believe all GPUs have some concept of cores or compute units.
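
As a sketch of how a single abstracted resource like that could surface
as a controller interface, mirroring the cftype pattern already used in
the RFC (the file and callback names here are assumptions, not part of
the proposal):

/* would print one line per drm device, e.g. "cardN: <core count>" */
static int drmcgrp_cores_show(struct seq_file *sf, void *v)
{
        return 0;
}

static struct cftype drmcgrp_core_files[] = {
        {
                .name = "cores",                /* surfaces as drm.cores */
                .seq_show = drmcgrp_cores_show,
                .flags = CFTYPE_NOT_ON_ROOT,
        },
        { }     /* terminate */
};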

Your feedback is highly appreciated.

Best Regards,
Harish



From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> on behalf of Tejun Heo <tj@kernel.org>
Sent: Tuesday, November 20, 2018 5:30 PM
To: Ho, Kenny
Cc: cgroups@vger.kernel.org; intel-gfx@lists.freedesktop.org; y2kenny@gmail.com; amd-gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org
Subject: Re: [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices
  

Hello,

On Tue, Nov 20, 2018 at 10:21:14PM +0000, Ho, Kenny wrote:
> By this reply, are you suggesting that vendor specific resources
> will never be acceptable to be managed under cgroup?  Let say a user

I wouldn't say never but whatever which gets included as a cgroup
controller should have clearly defined resource abstractions and the
control schemes around them including support for delegation.  AFAICS,
gpu side still seems to have a long way to go (and it's not clear
whether that's somewhere it will or needs to end up).

> want to have similar functionality as what cgroup is offering but to
> manage vendor specific resources, what would you suggest as a
> solution?  When you say keeping vendor specific resource regulation
> inside drm or specific drivers, do you mean we should replicate the
> cgroup infrastructure there or do you mean either drm or specific
> driver should query existing hierarchy (such as device or perhaps
> cpu) for the process organization information?
> 
> To put the questions in more concrete terms, let say a user wants to
> expose certain part of a gpu to a particular cgroup similar to the
> way selective cpu cores are exposed to a cgroup via cpuset, how
> should we go about enabling such functionality?

Do what the intel driver or bpf is doing?  It's not difficult to hook
into cgroup for identification purposes.

Thanks.

-- 
tejun
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices
  2018-11-26 20:59                   ` Kasiviswanathan, Harish
@ 2018-11-27  9:38                     ` Koenig, Christian
  2018-11-27  9:46                     ` [Intel-gfx] " Joonas Lahtinen
  1 sibling, 0 replies; 80+ messages in thread
From: Koenig, Christian @ 2018-11-27  9:38 UTC (permalink / raw)
  To: Kasiviswanathan, Harish, Tejun Heo, Ho, Kenny
  Cc: cgroups, intel-gfx, y2kenny, amd-gfx, dri-devel

Hi Harish,

Am 26.11.18 um 21:59 schrieb Kasiviswanathan, Harish:
> Thanks Tejun,Eric and Christian for your replies.
>
> We want GPUs resource management to work seamlessly with containers and container orchestration. With the Intel / bpf based approach this is not possible.

I think one lesson learned is that we should describe this goal in the 
patch cover letter when sending it out. That could have avoided some of 
the initial confusion.

>  From your response we gather the following. GPU resources need to be abstracted. We will send a new proposal in same vein. Our current thinking is to start with a single abstracted resource and build a framework that can be expanded to include additional resources. We plan to start with “GPU cores”. We believe all GPUs have some concept of cores or compute unit.

Sounds good, just one comment on creating a framework: before doing 
something like this, think for a moment about whether it would be better 
to simply extend the existing cgroup framework. That approach usually 
makes more sense because you rarely need something fundamentally new.

Regards,
Christian.

>
> Your feedback is highly appreciated.
>
> Best Regards,
> Harish
>
>
>
> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> on behalf of Tejun Heo <tj@kernel.org>
> Sent: Tuesday, November 20, 2018 5:30 PM
> To: Ho, Kenny
> Cc: cgroups@vger.kernel.org; intel-gfx@lists.freedesktop.org; y2kenny@gmail.com; amd-gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org
> Subject: Re: [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices
>    
>
> Hello,
>
> On Tue, Nov 20, 2018 at 10:21:14PM +0000, Ho, Kenny wrote:
>> By this reply, are you suggesting that vendor specific resources
>> will never be acceptable to be managed under cgroup?  Let say a user
> I wouldn't say never but whatever which gets included as a cgroup
> controller should have clearly defined resource abstractions and the
> control schemes around them including support for delegation.  AFAICS,
> gpu side still seems to have a long way to go (and it's not clear
> whether that's somewhere it will or needs to end up).
>
>> want to have similar functionality as what cgroup is offering but to
>> manage vendor specific resources, what would you suggest as a
>> solution?  When you say keeping vendor specific resource regulation
>> inside drm or specific drivers, do you mean we should replicate the
>> cgroup infrastructure there or do you mean either drm or specific
>> driver should query existing hierarchy (such as device or perhaps
>> cpu) for the process organization information?
>>
>> To put the questions in more concrete terms, let say a user wants to
>> expose certain part of a gpu to a particular cgroup similar to the
>> way selective cpu cores are exposed to a cgroup via cpuset, how
>> should we go about enabling such functionality?
> Do what the intel driver or bpf is doing?  It's not difficult to hook
> into cgroup for identification purposes.
>
> Thanks.
>

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [Intel-gfx] [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices
  2018-11-26 20:59                   ` Kasiviswanathan, Harish
  2018-11-27  9:38                     ` Koenig, Christian
@ 2018-11-27  9:46                     ` Joonas Lahtinen
  2018-11-27 15:41                       ` Ho, Kenny
  1 sibling, 1 reply; 80+ messages in thread
From: Joonas Lahtinen @ 2018-11-27  9:46 UTC (permalink / raw)
  To: Ho, Kenny, Kasiviswanathan, Harish, Tejun Heo
  Cc: cgroups, intel-gfx, y2kenny, amd-gfx, dri-devel

Quoting Kasiviswanathan, Harish (2018-11-26 22:59:30)
> Thanks Tejun,Eric and Christian for your replies.
> 
> We want GPUs resource management to work seamlessly with containers and container orchestration. With the Intel / bpf based approach this is not possible. 
> 
> From your response we gather the following. GPU resources need to be abstracted. We will send a new proposal in same vein. Our current thinking is to start with a single abstracted resource and build a framework that can be expanded to include additional resources. We plan to start with “GPU cores”. We believe all GPUs have some concept of cores or compute unit.

I think a more abstract property "% of GPU (processing power)" might
be a more universal approach. One can then implement that through
subdividing the resources or timeslicing them, depending on the GPU
topology.

Leasing 1/8th, 1/4th or 1/2 of the GPU would probably be the most
applicable to cloud provider use cases, too. At least that's what I
see done for the CPUs today.

That combined with the "GPU memory usable" property should be a good
starting point to start subdividing the GPU resources for multiple
users.
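
Hypothetical knobs modeled on the existing cpu.max / memory.max
conventions could then look like the following; the names and formats
are assumptions, purely for illustration:

  # lease half of the GPU's processing power, cpu.max-style ($MAX $PERIOD)
  echo "50000 100000" > /sys/fs/cgroup/<group>/gpu.max
  # cap the usable GPU memory, memory.max-style
  echo "4G" > /sys/fs/cgroup/<group>/gpu.memory.max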

Regards, Joonas

> 
> Your feedback is highly appreciated.
> 
> Best Regards,
> Harish
> 
> 
> 
> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> on behalf of Tejun Heo <tj@kernel.org>
> Sent: Tuesday, November 20, 2018 5:30 PM
> To: Ho, Kenny
> Cc: cgroups@vger.kernel.org; intel-gfx@lists.freedesktop.org; y2kenny@gmail.com; amd-gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org
> Subject: Re: [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices
>   
> 
> Hello,
> 
> On Tue, Nov 20, 2018 at 10:21:14PM +0000, Ho, Kenny wrote:
> > By this reply, are you suggesting that vendor specific resources
> > will never be acceptable to be managed under cgroup?  Let say a user
> 
> I wouldn't say never but whatever which gets included as a cgroup
> controller should have clearly defined resource abstractions and the
> control schemes around them including support for delegation.  AFAICS,
> gpu side still seems to have a long way to go (and it's not clear
> whether that's somewhere it will or needs to end up).
> 
> > want to have similar functionality as what cgroup is offering but to
> > manage vendor specific resources, what would you suggest as a
> > solution?  When you say keeping vendor specific resource regulation
> > inside drm or specific drivers, do you mean we should replicate the
> > cgroup infrastructure there or do you mean either drm or specific
> > driver should query existing hierarchy (such as device or perhaps
> > cpu) for the process organization information?
> > 
> > To put the questions in more concrete terms, let say a user wants to
> > expose certain part of a gpu to a particular cgroup similar to the
> > way selective cpu cores are exposed to a cgroup via cpuset, how
> > should we go about enabling such functionality?
> 
> Do what the intel driver or bpf is doing?  It's not difficult to hook
> into cgroup for identification purposes.
> 
> Thanks.
> 
> -- 
> tejun
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> 
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 80+ messages in thread

* RE: [Intel-gfx] [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices
  2018-11-27  9:46                     ` [Intel-gfx] " Joonas Lahtinen
@ 2018-11-27 15:41                       ` Ho, Kenny
  2018-11-28  9:14                         ` Joonas Lahtinen
  0 siblings, 1 reply; 80+ messages in thread
From: Ho, Kenny @ 2018-11-27 15:41 UTC (permalink / raw)
  To: Joonas Lahtinen, Kasiviswanathan, Harish, Tejun Heo
  Cc: cgroups, intel-gfx, y2kenny, amd-gfx, dri-devel

On Tue, Nov 27, 2018 at 4:46 AM Joonas Lahtinen <joonas.lahtinen@linux.intel.com> wrote:
> I think a more abstract property "% of GPU (processing power)" might
> be a more universal approach. One can then implement that through
> subdividing the resources or timeslicing them, depending on the GPU
> topology.
>
> Leasing 1/8th, 1/4th or 1/2 of the GPU would probably be the most
> applicable to cloud provider usecases, too. At least that's what I
> see done for the CPUs today.
I think there are opportunities to slice the gpu in more than one way (similar to the way it is done for cpu).  We can potentially frame resources as continuous or discrete.  Percentage definitely fits well for continuous measurements such as time/time slices, but I think there are places for discrete units such as core counts as well.
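
A minimal sketch of what the discrete, cpuset-like side could look like
(the file name "drm.cus", the MAX_GPU_CU bound and the hook wiring are
all assumptions for illustration only):

#include <linux/bitmap.h>
#include <linux/cgroup.h>
#include <linux/string.h>

#define MAX_GPU_CU 64                   /* assumption: CU count of the device */

static DECLARE_BITMAP(allowed_cus, MAX_GPU_CU);

/*
 * Would back a hypothetical "drm.cus" file, so that
 * "echo 0-15,32 > drm.cus" selects specific compute units,
 * in the same spirit as cpuset.cpus.
 */
static ssize_t drmcg_cus_write(struct kernfs_open_file *of, char *buf,
                               size_t nbytes, loff_t off)
{
        int ret = bitmap_parselist(strstrip(buf), allowed_cus, MAX_GPU_CU);

        return ret ?: nbytes;
}

The continuous case would instead be a plain weight or percentage file,
as in the earlier sketch in this thread.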

Regards,
Kenny

> That combined with the "GPU memory usable" property should be a good
> starting point to start subdividing the GPU resources for multiple
> users.
>
> Regards, Joonas
>
> >
> > Your feedback is highly appreciated.
> >
> > Best Regards,
> > Harish
> >
> >
> >
> > From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> on behalf of Tejun Heo <tj@kernel.org>
> > Sent: Tuesday, November 20, 2018 5:30 PM
> > To: Ho, Kenny
> > Cc: cgroups@vger.kernel.org; intel-gfx@lists.freedesktop.org; y2kenny@gmail.com; amd-gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org
> > Subject: Re: [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices
> >
> >
> > Hello,
> >
> > On Tue, Nov 20, 2018 at 10:21:14PM +0000, Ho, Kenny wrote:
> > > By this reply, are you suggesting that vendor specific resources
> > > will never be acceptable to be managed under cgroup?  Let say a user
> >
> > I wouldn't say never but whatever which gets included as a cgroup
> > controller should have clearly defined resource abstractions and the
> > control schemes around them including support for delegation.  AFAICS,
> > gpu side still seems to have a long way to go (and it's not clear
> > whether that's somewhere it will or needs to end up).
> >
> > > want to have similar functionality as what cgroup is offering but to
> > > manage vendor specific resources, what would you suggest as a
> > > solution?  When you say keeping vendor specific resource regulation
> > > inside drm or specific drivers, do you mean we should replicate the
> > > cgroup infrastructure there or do you mean either drm or specific
> > > driver should query existing hierarchy (such as device or perhaps
> > > cpu) for the process organization information?
> > >
> > > To put the questions in more concrete terms, let say a user wants to
> > > expose certain part of a gpu to a particular cgroup similar to the
> > > way selective cpu cores are exposed to a cgroup via cpuset, how
> > > should we go about enabling such functionality?
> >
> > Do what the intel driver or bpf is doing?  It's not difficult to hook
> > into cgroup for identification purposes.
> >
> > Thanks.
> >
> > --
> > tejun
> > _______________________________________________
> > amd-gfx mailing list
> > amd-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> >
> >
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH RFC 5/5] drm/amdgpu: Add accounting of buffer object creation request via DRM cgroup
  2018-11-21 10:00     ` Christian König
@ 2018-11-27 18:15       ` Kenny Ho
       [not found]         ` <CAOWid-fMFUvT_XQijRd34+cUOxM=zbbf+HwWv_NbqO-rBo2d_A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 80+ messages in thread
From: Kenny Ho @ 2018-11-27 18:15 UTC (permalink / raw)
  To: christian.koenig; +Cc: Kenny.Ho, amd-gfx, dri-devel

Hey Christian,

Sorry for the late reply, I missed this for some reason.

On Wed, Nov 21, 2018 at 5:00 AM Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
> > diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
> > index 370e9a5536ef..531726443104 100644
> > --- a/include/uapi/drm/amdgpu_drm.h
> > +++ b/include/uapi/drm/amdgpu_drm.h
> > @@ -72,6 +72,18 @@ extern "C" {
> >   #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union drm_amdgpu_fence_to_handle)
> >   #define DRM_IOCTL_AMDGPU_SCHED              DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
> >
> > +enum amdgpu_mem_domain {
> > +     AMDGPU_MEM_DOMAIN_CPU,
> > +     AMDGPU_MEM_DOMAIN_GTT,
> > +     AMDGPU_MEM_DOMAIN_VRAM,
> > +     AMDGPU_MEM_DOMAIN_GDS,
> > +     AMDGPU_MEM_DOMAIN_GWS,
> > +     AMDGPU_MEM_DOMAIN_OA,
> > +     __MAX_AMDGPU_MEM_DOMAIN
> > +};
>
> Well that is a clear NAK since it duplicates the TTM defines. Please use
> that one instead and don't make this UAPI.
This is defined to help with the chunk of changes below.  The
AMDGPU_GEM_DOMAIN* already exists and this is similar to how TTM has
TTM_PL_* to help with the creation of TTM_PL_FLAG_*:
https://elixir.bootlin.com/linux/v4.20-rc4/source/include/drm/ttm/ttm_placement.h#L36

I don't disagree that there is a duplication here but it's
pre-existing so if you can help clarify my confusion that would be
much appreciated.

Regards,
Kenny

> > +
> > +extern char const *amdgpu_mem_domain_names[];
> > +
> >   /**
> >    * DOC: memory domains
> >    *
> > @@ -95,12 +107,12 @@ extern "C" {
> >    * %AMDGPU_GEM_DOMAIN_OA    Ordered append, used by 3D or Compute engines
> >    * for appending data.
> >    */
> > -#define AMDGPU_GEM_DOMAIN_CPU                0x1
> > -#define AMDGPU_GEM_DOMAIN_GTT                0x2
> > -#define AMDGPU_GEM_DOMAIN_VRAM               0x4
> > -#define AMDGPU_GEM_DOMAIN_GDS                0x8
> > -#define AMDGPU_GEM_DOMAIN_GWS                0x10
> > -#define AMDGPU_GEM_DOMAIN_OA         0x20
> > +#define AMDGPU_GEM_DOMAIN_CPU                (1 << AMDGPU_MEM_DOMAIN_CPU)
> > +#define AMDGPU_GEM_DOMAIN_GTT                (1 << AMDGPU_MEM_DOMAIN_GTT)
> > +#define AMDGPU_GEM_DOMAIN_VRAM               (1 << AMDGPU_MEM_DOMAIN_VRAM)
> > +#define AMDGPU_GEM_DOMAIN_GDS                (1 << AMDGPU_MEM_DOMAIN_GDS)
> > +#define AMDGPU_GEM_DOMAIN_GWS                (1 << AMDGPU_MEM_DOMAIN_GWS)
> > +#define AMDGPU_GEM_DOMAIN_OA         (1 << AMDGPU_MEM_DOMAIN_OA)
> >   #define AMDGPU_GEM_DOMAIN_MASK              (AMDGPU_GEM_DOMAIN_CPU | \
> >                                        AMDGPU_GEM_DOMAIN_GTT | \
> >                                        AMDGPU_GEM_DOMAIN_VRAM | \
>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH RFC 5/5] drm/amdgpu: Add accounting of buffer object creation request via DRM cgroup
       [not found]         ` <CAOWid-fMFUvT_XQijRd34+cUOxM=zbbf+HwWv_NbqO-rBo2d_A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2018-11-27 20:31           ` Christian König
       [not found]             ` <3299d9d6-e272-0459-8f63-0c81d11cde1e-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 80+ messages in thread
From: Christian König @ 2018-11-27 20:31 UTC (permalink / raw)
  To: Kenny Ho, christian.koenig-5C7GfCeVMHo
  Cc: Kenny.Ho-5C7GfCeVMHo, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Am 27.11.18 um 19:15 schrieb Kenny Ho:
> Hey Christian,
>
> Sorry for the late reply, I missed this for some reason.
>
> On Wed, Nov 21, 2018 at 5:00 AM Christian König
> <ckoenig.leichtzumerken@gmail.com> wrote:
>>> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
>>> index 370e9a5536ef..531726443104 100644
>>> --- a/include/uapi/drm/amdgpu_drm.h
>>> +++ b/include/uapi/drm/amdgpu_drm.h
>>> @@ -72,6 +72,18 @@ extern "C" {
>>>    #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union drm_amdgpu_fence_to_handle)
>>>    #define DRM_IOCTL_AMDGPU_SCHED              DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
>>>
>>> +enum amdgpu_mem_domain {
>>> +     AMDGPU_MEM_DOMAIN_CPU,
>>> +     AMDGPU_MEM_DOMAIN_GTT,
>>> +     AMDGPU_MEM_DOMAIN_VRAM,
>>> +     AMDGPU_MEM_DOMAIN_GDS,
>>> +     AMDGPU_MEM_DOMAIN_GWS,
>>> +     AMDGPU_MEM_DOMAIN_OA,
>>> +     __MAX_AMDGPU_MEM_DOMAIN
>>> +};
>> Well that is a clear NAK since it duplicates the TTM defines. Please use
>> that one instead and don't make this UAPI.
> This is defined to help with the chunk of changes below.  The
> AMDGPU_GEM_DOMAIN* already exists and this is similar to how TTM has
> TTM_PL_* to help with the creation of TTM_PL_FLAG_*:
> https://elixir.bootlin.com/linux/v4.20-rc4/source/include/drm/ttm/ttm_placement.h#L36
>
> I don't disagree that there is a duplication here but it's
> pre-existing so if you can help clarify my confusion that would be
> much appreciated.

The AMDGPU_GEM_DOMAIN defines are masks which are used in the frontend IOCTL 
interface to create BOs.

TTM defines the backend pools where the memory is then allocated from to 
fill the BOs.

So you are mixing frontend and backend here.

In other words for the whole cgroup interface you should not make a 
single change to amdgpu_drm.h or otherwise you are doing something wrong.
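
For instance, the accounting could key off the TTM placement on the
backend side instead.  A minimal sketch, assuming a hypothetical
drmcg_charge_mem() helper on the cgroup side (none of this touches
amdgpu_drm.h):

#include <drm/ttm/ttm_bo_api.h>
#include <drm/ttm/ttm_placement.h>

/* Hypothetical cgroup-side charge helper, not an existing function. */
void drmcg_charge_mem(const char *pool, size_t size);

static const char *drmcg_pool_name(unsigned int mem_type)
{
        switch (mem_type) {
        case TTM_PL_SYSTEM:     return "system";
        case TTM_PL_TT:         return "gtt";
        case TTM_PL_VRAM:       return "vram";
        default:                return "priv";  /* driver pools such as GDS/GWS/OA */
        }
}

/* Charge against the backend pool the buffer object actually landed in. */
static void drmcg_account_bo(struct ttm_buffer_object *bo)
{
        drmcg_charge_mem(drmcg_pool_name(bo->mem.mem_type),
                         bo->num_pages << PAGE_SHIFT);
}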

Regards,
Christian.

>
> Reards,
> Kenny
>
>>> +
>>> +extern char const *amdgpu_mem_domain_names[];
>>> +
>>>    /**
>>>     * DOC: memory domains
>>>     *
>>> @@ -95,12 +107,12 @@ extern "C" {
>>>     * %AMDGPU_GEM_DOMAIN_OA    Ordered append, used by 3D or Compute engines
>>>     * for appending data.
>>>     */
>>> -#define AMDGPU_GEM_DOMAIN_CPU                0x1
>>> -#define AMDGPU_GEM_DOMAIN_GTT                0x2
>>> -#define AMDGPU_GEM_DOMAIN_VRAM               0x4
>>> -#define AMDGPU_GEM_DOMAIN_GDS                0x8
>>> -#define AMDGPU_GEM_DOMAIN_GWS                0x10
>>> -#define AMDGPU_GEM_DOMAIN_OA         0x20
>>> +#define AMDGPU_GEM_DOMAIN_CPU                (1 << AMDGPU_MEM_DOMAIN_CPU)
>>> +#define AMDGPU_GEM_DOMAIN_GTT                (1 << AMDGPU_MEM_DOMAIN_GTT)
>>> +#define AMDGPU_GEM_DOMAIN_VRAM               (1 << AMDGPU_MEM_DOMAIN_VRAM)
>>> +#define AMDGPU_GEM_DOMAIN_GDS                (1 << AMDGPU_MEM_DOMAIN_GDS)
>>> +#define AMDGPU_GEM_DOMAIN_GWS                (1 << AMDGPU_MEM_DOMAIN_GWS)
>>> +#define AMDGPU_GEM_DOMAIN_OA         (1 << AMDGPU_MEM_DOMAIN_OA)
>>>    #define AMDGPU_GEM_DOMAIN_MASK              (AMDGPU_GEM_DOMAIN_CPU | \
>>>                                         AMDGPU_GEM_DOMAIN_GTT | \
>>>                                         AMDGPU_GEM_DOMAIN_VRAM | \
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH RFC 5/5] drm/amdgpu: Add accounting of buffer object creation request via DRM cgroup
       [not found]             ` <3299d9d6-e272-0459-8f63-0c81d11cde1e-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2018-11-27 20:36               ` Kenny Ho
  0 siblings, 0 replies; 80+ messages in thread
From: Kenny Ho @ 2018-11-27 20:36 UTC (permalink / raw)
  To: christian.koenig-5C7GfCeVMHo
  Cc: Kenny.Ho-5C7GfCeVMHo, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Ah I see.  Thank you for the clarification.

Regards,
Kenny
On Tue, Nov 27, 2018 at 3:31 PM Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> Am 27.11.18 um 19:15 schrieb Kenny Ho:
> > Hey Christian,
> >
> > Sorry for the late reply, I missed this for some reason.
> >
> > On Wed, Nov 21, 2018 at 5:00 AM Christian König
> > <ckoenig.leichtzumerken@gmail.com> wrote:
> >>> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
> >>> index 370e9a5536ef..531726443104 100644
> >>> --- a/include/uapi/drm/amdgpu_drm.h
> >>> +++ b/include/uapi/drm/amdgpu_drm.h
> >>> @@ -72,6 +72,18 @@ extern "C" {
> >>>    #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union drm_amdgpu_fence_to_handle)
> >>>    #define DRM_IOCTL_AMDGPU_SCHED              DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
> >>>
> >>> +enum amdgpu_mem_domain {
> >>> +     AMDGPU_MEM_DOMAIN_CPU,
> >>> +     AMDGPU_MEM_DOMAIN_GTT,
> >>> +     AMDGPU_MEM_DOMAIN_VRAM,
> >>> +     AMDGPU_MEM_DOMAIN_GDS,
> >>> +     AMDGPU_MEM_DOMAIN_GWS,
> >>> +     AMDGPU_MEM_DOMAIN_OA,
> >>> +     __MAX_AMDGPU_MEM_DOMAIN
> >>> +};
> >> Well that is a clear NAK since it duplicates the TTM defines. Please use
> >> that one instead and don't make this UAPI.
> > This is defined to help with the chunk of changes below.  The
> > AMDGPU_GEM_DOMAIN* already exists and this is similar to how TTM has
> > TTM_PL_* to help with the creation of TTM_PL_FLAG_*:
> > https://elixir.bootlin.com/linux/v4.20-rc4/source/include/drm/ttm/ttm_placement.h#L36
> >
> > I don't disagree that there is a duplication here but it's
> > pre-existing so if you can help clarify my confusion that would be
> > much appreciated.
>
> The AMDGPU_GEM_DOMAIN are masks which are used in the frontend IOCTL
> interface to create BOs.
>
> TTM defines the backend pools where the memory is then allocated from to
> fill the BOs.
>
> So you are mixing frontend and backend here.
>
> In other words for the whole cgroup interface you should not make a
> single change to amdgpu_drm.h or otherwise you are doing something wrong.
>
> Regards,
> Christian.
>
> >
> > Reards,
> > Kenny
> >
> >>> +
> >>> +extern char const *amdgpu_mem_domain_names[];
> >>> +
> >>>    /**
> >>>     * DOC: memory domains
> >>>     *
> >>> @@ -95,12 +107,12 @@ extern "C" {
> >>>     * %AMDGPU_GEM_DOMAIN_OA    Ordered append, used by 3D or Compute engines
> >>>     * for appending data.
> >>>     */
> >>> -#define AMDGPU_GEM_DOMAIN_CPU                0x1
> >>> -#define AMDGPU_GEM_DOMAIN_GTT                0x2
> >>> -#define AMDGPU_GEM_DOMAIN_VRAM               0x4
> >>> -#define AMDGPU_GEM_DOMAIN_GDS                0x8
> >>> -#define AMDGPU_GEM_DOMAIN_GWS                0x10
> >>> -#define AMDGPU_GEM_DOMAIN_OA         0x20
> >>> +#define AMDGPU_GEM_DOMAIN_CPU                (1 << AMDGPU_MEM_DOMAIN_CPU)
> >>> +#define AMDGPU_GEM_DOMAIN_GTT                (1 << AMDGPU_MEM_DOMAIN_GTT)
> >>> +#define AMDGPU_GEM_DOMAIN_VRAM               (1 << AMDGPU_MEM_DOMAIN_VRAM)
> >>> +#define AMDGPU_GEM_DOMAIN_GDS                (1 << AMDGPU_MEM_DOMAIN_GDS)
> >>> +#define AMDGPU_GEM_DOMAIN_GWS                (1 << AMDGPU_MEM_DOMAIN_GWS)
> >>> +#define AMDGPU_GEM_DOMAIN_OA         (1 << AMDGPU_MEM_DOMAIN_OA)
> >>>    #define AMDGPU_GEM_DOMAIN_MASK              (AMDGPU_GEM_DOMAIN_CPU | \
> >>>                                         AMDGPU_GEM_DOMAIN_GTT | \
> >>>                                         AMDGPU_GEM_DOMAIN_VRAM | \
> > _______________________________________________
> > amd-gfx mailing list
> > amd-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 80+ messages in thread

* RE: [Intel-gfx] [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices
  2018-11-27 15:41                       ` Ho, Kenny
@ 2018-11-28  9:14                         ` Joonas Lahtinen
  2018-11-28 19:46                           ` Ho, Kenny
       [not found]                           ` <154339645444.5339.6291298808444340104-zzJjBcU1GAT9BXuAQUXR0fooFf0ArEBIu+b9c/7xato@public.gmane.org>
  0 siblings, 2 replies; 80+ messages in thread
From: Joonas Lahtinen @ 2018-11-28  9:14 UTC (permalink / raw)
  To: Ho, Kenny, Kasiviswanathan, Harish, Tejun Heo
  Cc: cgroups, intel-gfx, y2kenny, amd-gfx, dri-devel

Quoting Ho, Kenny (2018-11-27 17:41:17)
> On Tue, Nov 27, 2018 at 4:46 AM Joonas Lahtinen <joonas.lahtinen@linux.intel.com> wrote:
> > I think a more abstract property "% of GPU (processing power)" might
> > be a more universal approach. One can then implement that through
> > subdividing the resources or timeslicing them, depending on the GPU
> > topology.
> >
> > Leasing 1/8th, 1/4th or 1/2 of the GPU would probably be the most
> > applicable to cloud provider usecases, too. At least that's what I
> > see done for the CPUs today.
> I think there are opportunities to slice the gpu in more than one way (similar to the way it is done for cpu.)  We can potentially frame resources as continuous or discrete.  Percentage definitely fits well for continuous measurements such as time/time slices but I think there are places for discrete units such as core counts as well.

I think the ask in response to the earlier series from Intel was to agree
on the variables that could be common to all of the DRM subsystem.

So we can only choose the lowest common denominator, right?

Any core count out of total core count should translate nicely into a
fraction, so what would be the problem with percentage amounts?
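
As a tiny sketch of that translation (the helper name is made up and
nothing here is tied to a driver):

#include <linux/math64.h>

/* A discrete core count collapses into a percentage trivially. */
static u64 cu_count_to_pct(u64 cus, u64 total_cus)
{
        return div64_u64(cus * 100, total_cus);
}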

Regards, Joonas

> 
> Regards,
> Kenny
> 
> > That combined with the "GPU memory usable" property should be a good
> > starting point to start subdividing the GPU resources for multiple
> > users.
> >
> > Regards, Joonas
> >
> > >
> > > Your feedback is highly appreciated.
> > >
> > > Best Regards,
> > > Harish
> > >
> > >
> > >
> > > From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> on behalf of Tejun Heo <tj@kernel.org>
> > > Sent: Tuesday, November 20, 2018 5:30 PM
> > > To: Ho, Kenny
> > > Cc: cgroups@vger.kernel.org; intel-gfx@lists.freedesktop.org; y2kenny@gmail.com; amd-gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org
> > > Subject: Re: [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices
> > >
> > >
> > > Hello,
> > >
> > > On Tue, Nov 20, 2018 at 10:21:14PM +0000, Ho, Kenny wrote:
> > > > By this reply, are you suggesting that vendor specific resources
> > > > will never be acceptable to be managed under cgroup?  Let say a user
> > >
> > > I wouldn't say never but whatever which gets included as a cgroup
> > > controller should have clearly defined resource abstractions and the
> > > control schemes around them including support for delegation.  AFAICS,
> > > gpu side still seems to have a long way to go (and it's not clear
> > > whether that's somewhere it will or needs to end up).
> > >
> > > > want to have similar functionality as what cgroup is offering but to
> > > > manage vendor specific resources, what would you suggest as a
> > > > solution?  When you say keeping vendor specific resource regulation
> > > > inside drm or specific drivers, do you mean we should replicate the
> > > > cgroup infrastructure there or do you mean either drm or specific
> > > > driver should query existing hierarchy (such as device or perhaps
> > > > cpu) for the process organization information?
> > > >
> > > > To put the questions in more concrete terms, let say a user wants to
> > > > expose certain part of a gpu to a particular cgroup similar to the
> > > > way selective cpu cores are exposed to a cgroup via cpuset, how
> > > > should we go about enabling such functionality?
> > >
> > > Do what the intel driver or bpf is doing?  It's not difficult to hook
> > > into cgroup for identification purposes.
> > >
> > > Thanks.
> > >
> > > --
> > > tejun
> > > _______________________________________________
> > > amd-gfx mailing list
> > > amd-gfx@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> > >
> > >
> > > _______________________________________________
> > > Intel-gfx mailing list
> > > Intel-gfx@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [Intel-gfx] [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices
  2018-11-28  9:14                         ` Joonas Lahtinen
@ 2018-11-28 19:46                           ` Ho, Kenny
  2018-11-30 22:22                             ` Matt Roper
       [not found]                           ` <154339645444.5339.6291298808444340104-zzJjBcU1GAT9BXuAQUXR0fooFf0ArEBIu+b9c/7xato@public.gmane.org>
  1 sibling, 1 reply; 80+ messages in thread
From: Ho, Kenny @ 2018-11-28 19:46 UTC (permalink / raw)
  To: Joonas Lahtinen, Kasiviswanathan, Harish, Tejun Heo
  Cc: cgroups, intel-gfx, y2kenny, amd-gfx, dri-devel


On Wed, Nov 28, 2018 at 4:14 AM Joonas Lahtinen <joonas.lahtinen@linux.intel.com> wrote:
> So we can only choose the lowest common denominator, right?
>
> Any core count out of total core count should translate nicely into a
> fraction, so what would be the problem with percentage amounts?

I don't think having an abstracted resource necessarily equates to 'lowest'.  The issue with percentage is the lack of precision.  If you look at the cpuset cgroup, you can see the specification can be very precise:

# /bin/echo 1-4,6 > cpuset.cpus -> set cpus list to cpus 1,2,3,4,6
(https://www.kernel.org/doc/Documentation/cgroup-v1/cpusets.txt)

The driver can translate something like this to a core count and then to a percentage and handle it accordingly, while the reverse is not possible.  (You can't tell which set of CUs/EUs a user wants from a percentage request.)  It's also not clear to me, from a user/application/admin/resource-management perspective, how the base core count of a GPU is relevant to the workload (since percentage is a 'relative' quantity.)  For example, let's say a workload wants to use 256 'cores': does it matter if that workload is put on a GPU with 1024 cores or a GPU with 4096 cores total?

I am not dismissing the possible need for percentage.  I just think there should be a way to accommodate more than just the 'lowest'. 
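
To sketch the precision argument in code (GPU_TOTAL_CUS and the helper
are made up): a CU set determines a unique percentage, but a percentage
only determines a count, so the driver has to pick an arbitrary set:

#include <linux/bitmap.h>

#define GPU_TOTAL_CUS 64        /* assumption for illustration */

/* One of many equally valid readings of, say, "25%": the first N CUs. */
static void pct_to_some_cu_set(unsigned int pct, unsigned long *mask)
{
        unsigned int n = GPU_TOTAL_CUS * pct / 100;

        bitmap_zero(mask, GPU_TOTAL_CUS);
        bitmap_set(mask, 0, n);         /* which n CUs? the percentage can't say */
}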

Regards,
Kenny


> > > That combined with the "GPU memory usable" property should be a good
> > > starting point to start subdividing the GPU resources for multiple
> > > users.
> > >
> > > Regards, Joonas
> > >
> > > >
> > > > Your feedback is highly appreciated.
> > > >
> > > > Best Regards,
> > > > Harish
> > > >
> > > >
> > > >
> > > > From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> on behalf of Tejun Heo <tj@kernel.org>
> > > > Sent: Tuesday, November 20, 2018 5:30 PM
> > > > To: Ho, Kenny
> > > > Cc: cgroups@vger.kernel.org; intel-gfx@lists.freedesktop.org; y2kenny@gmail.com; amd-gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org
> > > > Subject: Re: [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices
> > > >
> > > >
> > > > Hello,
> > > >
> > > > On Tue, Nov 20, 2018 at 10:21:14PM +0000, Ho, Kenny wrote:
> > > > > By this reply, are you suggesting that vendor specific resources
> > > > > will never be acceptable to be managed under cgroup?  Let say a user
> > > >
> > > > I wouldn't say never but whatever which gets included as a cgroup
> > > > controller should have clearly defined resource abstractions and the
> > > > control schemes around them including support for delegation.  AFAICS,
> > > > gpu side still seems to have a long way to go (and it's not clear
> > > > whether that's somewhere it will or needs to end up).
> > > >
> > > > > want to have similar functionality as what cgroup is offering but to
> > > > > manage vendor specific resources, what would you suggest as a
> > > > > solution?  When you say keeping vendor specific resource regulation
> > > > > inside drm or specific drivers, do you mean we should replicate the
> > > > > cgroup infrastructure there or do you mean either drm or specific
> > > > > driver should query existing hierarchy (such as device or perhaps
> > > > > cpu) for the process organization information?
> > > > >
> > > > > To put the questions in more concrete terms, let say a user wants to
> > > > > expose certain part of a gpu to a particular cgroup similar to the
> > > > > way selective cpu cores are exposed to a cgroup via cpuset, how
> > > > > should we go about enabling such functionality?
> > > >
> > > > Do what the intel driver or bpf is doing?  It's not difficult to hook
> > > > into cgroup for identification purposes.
> > > >
> > > > Thanks.
> > > >
> > > > --
> > > > tejun
> > > > _______________________________________________
> > > > amd-gfx mailing list
> > > > amd-gfx@lists.freedesktop.org
> > > > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> > > >
> > > >
> > > > _______________________________________________
> > > > Intel-gfx mailing list
> > > > Intel-gfx@lists.freedesktop.org
> > > > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices
  2018-11-28 19:46                           ` Ho, Kenny
@ 2018-11-30 22:22                             ` Matt Roper
       [not found]                               ` <20181130222228.GE31345-b/RNqDZ/lqH1fpGqjiHozbKMmGWinSIL2HeeBUIffwg@public.gmane.org>
  0 siblings, 1 reply; 80+ messages in thread
From: Matt Roper @ 2018-11-30 22:22 UTC (permalink / raw)
  To: Ho, Kenny; +Cc: intel-gfx, dri-devel, y2kenny, amd-gfx, Tejun Heo, cgroups

On Wed, Nov 28, 2018 at 07:46:06PM +0000, Ho, Kenny wrote:
> 
> On Wed, Nov 28, 2018 at 4:14 AM Joonas Lahtinen <joonas.lahtinen@linux.intel.com> wrote:
> > So we can only choose the lowest common denominator, right?
> >
> > Any core count out of total core count should translate nicely into a
> > fraction, so what would be the problem with percentage amounts?
> 
> I don't think having an abstracted resource necessarily equate
> 'lowest'.  The issue with percentage is the lack of precision.  If you
> look at cpuset cgroup, you can see the specification can be very
> precise:
> 
> # /bin/echo 1-4,6 > cpuset.cpus -> set cpus list to cpus 1,2,3,4,6
> (https://www.kernel.org/doc/Documentation/cgroup-v1/cpusets.txt)
> 
> The driver can translate something like this to core count and then to
> percentage and handle accordingly while the reverse is not possible.
> (You can't tell which set of CUs/EUs a user want from a percentage
> request.)  It's also not clear to me, from
> user/application/admin/resource management perspective, how the base
> core counts of a GPU is relevant to the workload (since percentage is
> a 'relative' quantity.)  For example, let say a workload wants to use
> 256 'cores', does it matter if that workload is put on a GPU with 1024
> cores or a GPU with 4096 cores total?
> 
> I am not dismissing the possible need for percentage.  I just think
> there should be a way to accommodate more than just the 'lowest'. 
> 

As you noted, your proposal is similar to the cgroup-v1 "cpuset"
controller, which is sort of a way of partitioning your underlying
hardware resources; I think Joonas is describing something closer in
design to the cgroup-v2 "cpu" controller, which partitions the general
time/usage allocated to each cgroup; afaiu, "cpu" doesn't really care
which specific core the tasks run on, just the relative weights that
determine how much time they get to run on any of the cores.

It sounds like with your hardware, your kernel driver is able to specify
exactly which subset of GPU EUs a specific GPU context winds up running
on.  However I think there are a lot of platforms that don't allow that
kind of low-level control.  E.g., I don't think we can do that on Intel
hardware; we have a handful of high-level GPU engines that we can submit
different types of batchbuffers to (render, blitter, media, etc.).  What
we can do is use GPU preemption to limit how much time specific GPU
contexts get to run on the render engine before the engine is reclaimed
for use by a different context.

Using a %gputime approach like Joonas is suggesting could be handled in
a driver by reserving specific subsets of EUs on hardware like yours
that's capable of doing that, whereas it could be mostly handled on
other types of hardware via GPU engine preemption.

I think either approach "gpu_euset" or "%gputime" should map well to a
cgroup controller implementation.  Granted, neither one solves the
specific use case I was working on earlier this year where we need
unfair (starvation-okay) scheduling that will run contexts strictly
according to priority (i.e., lower priority contexts will never run at
all unless all higher priority contexts have completed all of their
submitted work), but that's a pretty specialized use case that we'll
probably need to handle in a different manner anyway.
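
Conceptually, and only as a sketch with made-up names rather than how
any driver actually does it, a %gputime budget enforced through
preemption could reduce to bookkeeping like:

#include <linux/types.h>

struct drmcg_budget {
        u64 period_ns;          /* accounting window */
        u64 quota_ns;           /* GPU time allowed per window (the "%") */
        u64 used_ns;            /* GPU time consumed in the current window */
};

/* Checked before the engine is handed to a context from this cgroup. */
static bool drmcg_may_run(const struct drmcg_budget *b)
{
        return b->used_ns < b->quota_ns;
}

/*
 * Called when the context is preempted or completes, with the time it
 * held the engine.  Resetting used_ns every period_ns is omitted here.
 */
static void drmcg_charge_runtime(struct drmcg_budget *b, u64 delta_ns)
{
        b->used_ns += delta_ns;
}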


Matt


> Regards,
> Kennny
> 
> 
> > > > That combined with the "GPU memory usable" property should be a good
> > > > starting point to start subdividing the GPU resources for multiple
> > > > users.
> > > >
> > > > Regards, Joonas
> > > >
> > > > >
> > > > > Your feedback is highly appreciated.
> > > > >
> > > > > Best Regards,
> > > > > Harish
> > > > >
> > > > >
> > > > >
> > > > > From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> on behalf of Tejun Heo <tj@kernel.org>
> > > > > Sent: Tuesday, November 20, 2018 5:30 PM
> > > > > To: Ho, Kenny
> > > > > Cc: cgroups@vger.kernel.org; intel-gfx@lists.freedesktop.org; y2kenny@gmail.com; amd-gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org
> > > > > Subject: Re: [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices
> > > > >
> > > > >
> > > > > Hello,
> > > > >
> > > > > On Tue, Nov 20, 2018 at 10:21:14PM +0000, Ho, Kenny wrote:
> > > > > > By this reply, are you suggesting that vendor specific resources
> > > > > > will never be acceptable to be managed under cgroup?  Let say a user
> > > > >
> > > > > I wouldn't say never but whatever which gets included as a cgroup
> > > > > controller should have clearly defined resource abstractions and the
> > > > > control schemes around them including support for delegation.  AFAICS,
> > > > > gpu side still seems to have a long way to go (and it's not clear
> > > > > whether that's somewhere it will or needs to end up).
> > > > >
> > > > > > want to have similar functionality as what cgroup is offering but to
> > > > > > manage vendor specific resources, what would you suggest as a
> > > > > > solution?  When you say keeping vendor specific resource regulation
> > > > > > inside drm or specific drivers, do you mean we should replicate the
> > > > > > cgroup infrastructure there or do you mean either drm or specific
> > > > > > driver should query existing hierarchy (such as device or perhaps
> > > > > > cpu) for the process organization information?
> > > > > >
> > > > > > To put the questions in more concrete terms, let say a user wants to
> > > > > > expose certain part of a gpu to a particular cgroup similar to the
> > > > > > way selective cpu cores are exposed to a cgroup via cpuset, how
> > > > > > should we go about enabling such functionality?
> > > > >
> > > > > Do what the intel driver or bpf is doing?  It's not difficult to hook
> > > > > into cgroup for identification purposes.
> > > > >
> > > > > Thanks.
> > > > >
> > > > > --
> > > > > tejun
> > > > > _______________________________________________
> > > > > amd-gfx mailing list
> > > > > amd-gfx@lists.freedesktop.org
> > > > > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> > > > >
> > > > >
> > > > > _______________________________________________
> > > > > Intel-gfx mailing list
> > > > > Intel-gfx@lists.freedesktop.org
> > > > > https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Matt Roper
Graphics Software Engineer
IoTG Platform Enabling & Development
Intel Corporation
(916) 356-2795
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [Intel-gfx] [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices
       [not found]                               ` <20181130222228.GE31345-b/RNqDZ/lqH1fpGqjiHozbKMmGWinSIL2HeeBUIffwg@public.gmane.org>
@ 2018-12-03  6:46                                 ` Ho, Kenny
  2018-12-03 18:58                                   ` Matt Roper
  0 siblings, 1 reply; 80+ messages in thread
From: Ho, Kenny @ 2018-12-03  6:46 UTC (permalink / raw)
  To: Matt Roper
  Cc: intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Joonas Lahtinen,
	Kasiviswanathan,  Harish,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	y2kenny-Re5JQEeQqe8AvxtiuMwx3w,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Tejun Heo,
	cgroups-u79uwXL29TY76Z2rM5mHXA

Hey Matt,

On Fri, Nov 30, 2018 at 5:22 PM Matt Roper <matthew.d.roper@intel.com> wrote:
> I think Joonas is describing something closer in
> design to the cgroup-v2 "cpu" controller, which partitions the general
> time/usage allocated to via cgroup; afaiu, "cpu" doesn't really care
> which specific core the tasks run on, just the relative weights that
> determine how much time they get to run on any of the cores.

Depending on the level of optimization one wants to do, I think people care about which cpu core a task runs on.  Modern processors are no longer a monolithic 'thing'.  At least for AMD, there are multiple cpus on a core complex (CCX), multiple CCXs on a die, and multiple dies on a processor.  A task running on cpu 0 and cpu 1 of die 0 will behave very differently from a task running on core 0 of die 0 and core 0 of die 1 on the same socket.  (https://en.wikichip.org/wiki/amd/microarchitectures/zen#Die-die_memory_latencies)

It's not just an AMD thing either.  Here is an open issue on Intel's architecture:
https://github.com/kubernetes/kubernetes/issues/67355

and a proposed solution using cpu affinity https://github.com/kubernetes/community/blob/630acc487c80e4981a232cdd8400eb8207119788/keps/sig-node/0030-qos-class-cpu-affinity.md#proposal (by one of your colleagues.)

The time-based sharing below is also something we are thinking about, but it's personally not as exciting as the resource-based sharing for me because the time-share use case has already been addressed by our SRIOV/virtualization products.  We can potentially have different levels of time sharing using cgroup though (in addition to SRIOV), potentially trading efficiency against isolation.  That said, I think the time-based approach may be orthogonal to the resource-based approach (orthogonal in the sense that both are needed depending on the usage.)

Regards,
Kenny


> It sounds like with your hardware, your kernel driver is able to specify
> exactly which subset of GPU EU's a specific GPU context winds up running
> on.  However I think there are a lot of platforms that don't allow that
> kind of low-level control.  E.g., I don't think we can do that on Intel
> hardware; we have a handful of high-level GPU engines that we can submit
> different types of batchbuffers to (render, blitter, media, etc.).  What
> we can do is use GPU preemption to limit how much time specific GPU
> contexts get to run on the render engine before the engine is reclaimed
> for use by a different context.
>
> Using a %gputime approach like Joonas is suggesting could be handled in
> a driver by reserving specific subsets of EU's on hardware like yours
> that's capable of doing that, whereas it could be mostly handled on
> other types of hardware via GPU engine preemption.
>
> I think either approach "gpu_euset" or "%gputime" should map well to a
> cgroup controller implementation.  Granted, neither one solves the
> specific use case I was working on earlier this year where we need
> unfair (starvation-okay) scheduling that will run contexts strictly
> according to priority (i.e., lower priority contexts will never run at
> all unless all higher priority contexts have completed all of their
> submitted work), but that's a pretty specialized use case that we'll
> probably need to handle in a different manner anyway.
>
>
> Matt
>
>
> > Regards,
> > Kennny
> >
> >
> > > > > That combined with the "GPU memory usable" property should be a good
> > > > > starting point to start subdividing the GPU resources for multiple
> > > > > users.
> > > > >
> > > > > Regards, Joonas
> > > > >
> > > > > >
> > > > > > Your feedback is highly appreciated.
> > > > > >
> > > > > > Best Regards,
> > > > > > Harish
> > > > > >
> > > > > >
> > > > > >
> > > > > > From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> on behalf of Tejun Heo <tj@kernel.org>
> > > > > > Sent: Tuesday, November 20, 2018 5:30 PM
> > > > > > To: Ho, Kenny
> > > > > > Cc: cgroups@vger.kernel.org; intel-gfx@lists.freedesktop.org; y2kenny@gmail.com; amd-gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org
> > > > > > Subject: Re: [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices
> > > > > >
> > > > > >
> > > > > > Hello,
> > > > > >
> > > > > > On Tue, Nov 20, 2018 at 10:21:14PM +0000, Ho, Kenny wrote:
> > > > > > > By this reply, are you suggesting that vendor specific resources
> > > > > > > will never be acceptable to be managed under cgroup?  Let say a user
> > > > > >
> > > > > > I wouldn't say never but whatever which gets included as a cgroup
> > > > > > controller should have clearly defined resource abstractions and the
> > > > > > control schemes around them including support for delegation.  AFAICS,
> > > > > > gpu side still seems to have a long way to go (and it's not clear
> > > > > > whether that's somewhere it will or needs to end up).
> > > > > >
> > > > > > > want to have similar functionality as what cgroup is offering but to
> > > > > > > manage vendor specific resources, what would you suggest as a
> > > > > > > solution?  When you say keeping vendor specific resource regulation
> > > > > > > inside drm or specific drivers, do you mean we should replicate the
> > > > > > > cgroup infrastructure there or do you mean either drm or specific
> > > > > > > driver should query existing hierarchy (such as device or perhaps
> > > > > > > cpu) for the process organization information?
> > > > > > >
> > > > > > > To put the questions in more concrete terms, let say a user wants to
> > > > > > > expose certain part of a gpu to a particular cgroup similar to the
> > > > > > > way selective cpu cores are exposed to a cgroup via cpuset, how
> > > > > > > should we go about enabling such functionality?
> > > > > >
> > > > > > Do what the intel driver or bpf is doing?  It's not difficult to hook
> > > > > > into cgroup for identification purposes.
> > > > > >
> > > > > > Thanks.
> > > > > >
> > > > > > --
> > > > > > tejun
> > > > > > _______________________________________________
> > > > > > amd-gfx mailing list
> > > > > > amd-gfx@lists.freedesktop.org
> > > > > > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> > > > > >
> > > > > >
> > > > > > _______________________________________________
> > > > > > Intel-gfx mailing list
> > > > > > Intel-gfx@lists.freedesktop.org
> > > > > > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
> --
> Matt Roper
> Graphics Software Engineer
> IoTG Platform Enabling & Development
> Intel Corporation
> (916) 356-2795
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [Intel-gfx] [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices
  2018-12-03  6:46                                 ` [Intel-gfx] " Ho, Kenny
@ 2018-12-03 18:58                                   ` Matt Roper
  0 siblings, 0 replies; 80+ messages in thread
From: Matt Roper @ 2018-12-03 18:58 UTC (permalink / raw)
  To: Ho, Kenny
  Cc: intel-gfx, Kasiviswanathan, Harish, dri-devel, y2kenny, amd-gfx,
	Tejun Heo, cgroups

On Mon, Dec 03, 2018 at 06:46:01AM +0000, Ho, Kenny wrote:
> Hey Matt,
> 
> On Fri, Nov 30, 2018 at 5:22 PM Matt Roper <matthew.d.roper@intel.com> wrote:
> > I think Joonas is describing something closer in
> > design to the cgroup-v2 "cpu" controller, which partitions the general
> > time/usage allocated to via cgroup; afaiu, "cpu" doesn't really care
> > which specific core the tasks run on, just the relative weights that
> > determine how much time they get to run on any of the cores.
> 
> Depending on the level of optimization one wants to do, I think people
> care about which cpu core a task runs on.  Modern processors are no
> longer a monolithic 'thing'.  At least for AMD, there are multiple
> cpus on a core complex (CCX), multiple CCX on a die, and multiple dies
> on a processor.  A task running on cpu 0 and cpu 1 on die 0 will
> behave very differently from a task running on core 0s on die 0 and
> die 1 on the same socket.
> (https://en.wikichip.org/wiki/amd/microarchitectures/zen#Die-die_memory_latencies)
> 
> It's not just an AMD thing either.  Here is an open issue on Intel's architecture:
> https://github.com/kubernetes/kubernetes/issues/67355
> 
> and a proposed solution using cpu affinity
> https://github.com/kubernetes/community/blob/630acc487c80e4981a232cdd8400eb8207119788/keps/sig-node/0030-qos-class-cpu-affinity.md#proposal
> (by one of your colleagues.)

Right, I didn't mean to imply that the use case wasn't valid, I was just
referring to how I believe the cgroup-v2 'cpu' controller (i.e.,
cpu_cgrp_subsys) currently behaves, as a contrast to the behavior of the
cgroup-v1 'cpuset' controller.  I can definitely understand your
motivation for wanting something along the lines of a "gpuset"
controller, but as far as I know, that just isn't something that's
possible to implement on a lot of GPUs.

> 
> The time-based sharing below is also something we are thinking about,
> but it's personally not as exciting as the resource-based sharing for
> me because the time-share use case has already been addressed by our
> SRIOV/virtualization products.  We can potentially have different
> level of time sharing using cgroup though (in addition to SRIOV),
> potentially trading efficiency against isolation.  That said, I think
> the time-based approach maybe orthogonal to the resource-based
> approach (orthogonal in the sense that both are needed depending on
> the usage.)

Makes sense.


Matt


> 
> Regards,
> Kenny
> 
> 
> > It sounds like with your hardware, your kernel driver is able to specify
> > exactly which subset of GPU EU's a specific GPU context winds up running
> > on.  However I think there are a lot of platforms that don't allow that
> > kind of low-level control.  E.g., I don't think we can do that on Intel
> > hardware; we have a handful of high-level GPU engines that we can submit
> > different types of batchbuffers to (render, blitter, media, etc.).  What
> > we can do is use GPU preemption to limit how much time specific GPU
> > contexts get to run on the render engine before the engine is reclaimed
> > for use by a different context.
> >
> > Using a %gputime approach like Joonas is suggesting could be handled in
> > a driver by reserving specific subsets of EU's on hardware like yours
> > that's capable of doing that, whereas it could be mostly handled on
> > other types of hardware via GPU engine preemption.
> >
> > I think either approach "gpu_euset" or "%gputime" should map well to a
> > cgroup controller implementation.  Granted, neither one solves the
> > specific use case I was working on earlier this year where we need
> > unfair (starvation-okay) scheduling that will run contexts strictly
> > according to priority (i.e., lower priority contexts will never run at
> > all unless all higher priority contexts have completed all of their
> > submitted work), but that's a pretty specialized use case that we'll
> > probably need to handle in a different manner anyway.
> >
> >
> > Matt
> >
> >
> > > Regards,
> > > Kennny
> > >
> > >
> > > > > > That combined with the "GPU memory usable" property should be a good
> > > > > > starting point to start subdividing the GPU resources for multiple
> > > > > > users.
> > > > > >
> > > > > > Regards, Joonas
> > > > > >
> > > > > > >
> > > > > > > Your feedback is highly appreciated.
> > > > > > >
> > > > > > > Best Regards,
> > > > > > > Harish
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> on behalf of Tejun Heo <tj@kernel.org>
> > > > > > > Sent: Tuesday, November 20, 2018 5:30 PM
> > > > > > > To: Ho, Kenny
> > > > > > > Cc: cgroups@vger.kernel.org; intel-gfx@lists.freedesktop.org; y2kenny@gmail.com; amd-gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org
> > > > > > > Subject: Re: [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices
> > > > > > >
> > > > > > >
> > > > > > > Hello,
> > > > > > >
> > > > > > > On Tue, Nov 20, 2018 at 10:21:14PM +0000, Ho, Kenny wrote:
> > > > > > > > By this reply, are you suggesting that vendor specific resources
> > > > > > > > will never be acceptable to be managed under cgroup?  Let say a user
> > > > > > >
> > > > > > > I wouldn't say never but whatever which gets included as a cgroup
> > > > > > > controller should have clearly defined resource abstractions and the
> > > > > > > control schemes around them including support for delegation.  AFAICS,
> > > > > > > gpu side still seems to have a long way to go (and it's not clear
> > > > > > > whether that's somewhere it will or needs to end up).
> > > > > > >
> > > > > > > > want to have similar functionality as what cgroup is offering but to
> > > > > > > > manage vendor specific resources, what would you suggest as a
> > > > > > > > solution?  When you say keeping vendor specific resource regulation
> > > > > > > > inside drm or specific drivers, do you mean we should replicate the
> > > > > > > > cgroup infrastructure there or do you mean either drm or specific
> > > > > > > > driver should query existing hierarchy (such as device or perhaps
> > > > > > > > cpu) for the process organization information?
> > > > > > > >
> > > > > > > > To put the questions in more concrete terms, let say a user wants to
> > > > > > > > expose certain part of a gpu to a particular cgroup similar to the
> > > > > > > > way selective cpu cores are exposed to a cgroup via cpuset, how
> > > > > > > > should we go about enabling such functionality?
> > > > > > >
> > > > > > > Do what the intel driver or bpf is doing?  It's not difficult to hook
> > > > > > > into cgroup for identification purposes.
> > > > > > >
> > > > > > > Thanks.
> > > > > > >
> > > > > > > --
> > > > > > > tejun
> > > > > > > _______________________________________________
> > > > > > > amd-gfx mailing list
> > > > > > > amd-gfx@lists.freedesktop.org
> > > > > > > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> > > > > > >
> > > > > > >
> > > > > > > _______________________________________________
> > > > > > > Intel-gfx mailing list
> > > > > > > Intel-gfx@lists.freedesktop.org
> > > > > > > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> >
> > --
> > Matt Roper
> > Graphics Software Engineer
> > IoTG Platform Enabling & Development
> > Intel Corporation
> > (916) 356-2795

-- 
Matt Roper
Graphics Software Engineer
IoTG Platform Enabling & Development
Intel Corporation
(916) 356-2795
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [Intel-gfx] [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices
       [not found]                           ` <154339645444.5339.6291298808444340104-zzJjBcU1GAT9BXuAQUXR0fooFf0ArEBIu+b9c/7xato@public.gmane.org>
@ 2018-12-03 20:55                               ` Kuehling, Felix
  0 siblings, 0 replies; 80+ messages in thread
From: Kuehling, Felix @ 2018-12-03 20:55 UTC (permalink / raw)
  To: Joonas Lahtinen, Ho, Kenny, Kasiviswanathan, Harish, Tejun Heo
  Cc: cgroups-u79uwXL29TY76Z2rM5mHXA,
	intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	y2kenny-Re5JQEeQqe8AvxtiuMwx3w,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW


On 2018-11-28 4:14 a.m., Joonas Lahtinen wrote:
> Quoting Ho, Kenny (2018-11-27 17:41:17)
>> On Tue, Nov 27, 2018 at 4:46 AM Joonas Lahtinen <joonas.lahtinen@linux.intel.com> wrote:
>>> I think a more abstract property "% of GPU (processing power)" might
>>> be a more universal approach. One can then implement that through
>>> subdividing the resources or timeslicing them, depending on the GPU
>>> topology.
>>>
>>> Leasing 1/8th, 1/4th or 1/2 of the GPU would probably be the most
>>> applicable to cloud provider usecases, too. At least that's what I
>>> see done for the CPUs today.
>> I think there are opportunities to slice the gpu in more than one way (similar to the way it is done for cpu.)  We can potentially frame resources as continuous or discrete.  Percentage definitely fits well for continuous measurements such as time/time slices but I think there are places for discrete units such as core counts as well.
> I think the ask in return to the early series from Intal was to agree
> on the variables that could be common to all of DRM subsystem.
>
> So we can only choose the lowest common denominator, right?
>
> Any core count out of total core count should translate nicely into a
> fraction, so what would be the problem with percentage amounts?
How would you handle overcommitment with a percentage? That is, what
happens when more than 100% of the GPU cores are assigned across cgroups?
Which cgroups end up sharing cores would be up to chance.

If we allow specifying a set of GPU cores, we can be more specific in
assigning and sharing resources between cgroups.
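
As a short sketch of that point (sizes and names are assumptions),
explicit sets make the sharing a visible, deliberate overlap rather
than chance:

#include <linux/bitmap.h>

#define GPU_TOTAL_CUS 64        /* assumption for illustration */

/* With explicit CU sets, sharing between two cgroups is directly visible. */
static bool drmcg_sets_overlap(const unsigned long *a, const unsigned long *b)
{
        return bitmap_intersects(a, b, GPU_TOTAL_CUS);
}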

Regards,
  Felix


>
> Regards, Joonas
>
>> Regards,
>> Kenny
>>
>>> That combined with the "GPU memory usable" property should be a good
>>> starting point to start subdividing the GPU resources for multiple
>>> users.
>>>
>>> Regards, Joonas
>>>
>>>> Your feedback is highly appreciated.
>>>>
>>>> Best Regards,
>>>> Harish
>>>>
>>>>
>>>>
>>>> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> on behalf of Tejun Heo <tj@kernel.org>
>>>> Sent: Tuesday, November 20, 2018 5:30 PM
>>>> To: Ho, Kenny
>>>> Cc: cgroups@vger.kernel.org; intel-gfx@lists.freedesktop.org; y2kenny@gmail.com; amd-gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org
>>>> Subject: Re: [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices
>>>>
>>>>
>>>> Hello,
>>>>
>>>> On Tue, Nov 20, 2018 at 10:21:14PM +0000, Ho, Kenny wrote:
>>>>> By this reply, are you suggesting that vendor specific resources
>>>>> will never be acceptable to be managed under cgroup?  Let say a user
>>>> I wouldn't say never but whatever which gets included as a cgroup
>>>> controller should have clearly defined resource abstractions and the
>>>> control schemes around them including support for delegation.  AFAICS,
>>>> gpu side still seems to have a long way to go (and it's not clear
>>>> whether that's somewhere it will or needs to end up).
>>>>
>>>>> want to have similar functionality as what cgroup is offering but to
>>>>> manage vendor specific resources, what would you suggest as a
>>>>> solution?  When you say keeping vendor specific resource regulation
>>>>> inside drm or specific drivers, do you mean we should replicate the
>>>>> cgroup infrastructure there or do you mean either drm or specific
>>>>> driver should query existing hierarchy (such as device or perhaps
>>>>> cpu) for the process organization information?
>>>>>
>>>>> To put the questions in more concrete terms, let say a user wants to
>>>>> expose certain part of a gpu to a particular cgroup similar to the
>>>>> way selective cpu cores are exposed to a cgroup via cpuset, how
>>>>> should we go about enabling such functionality?
>>>> Do what the intel driver or bpf is doing?  It's not difficult to hook
>>>> into cgroup for identification purposes.
>>>>
>>>> Thanks.
>>>>
>>>> --
>>>> tejun
>>>> _______________________________________________
>>>> amd-gfx mailing list
>>>> amd-gfx@lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>>>
>>>>
>>>> _______________________________________________
>>>> Intel-gfx mailing list
>>>> Intel-gfx@lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [Intel-gfx] [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices
       [not found]                               ` <219f8754-3e14-05ad-07a3-6cddb8bb74aa-5C7GfCeVMHo@public.gmane.org>
@ 2018-12-05 14:20                                   ` Joonas Lahtinen
  0 siblings, 0 replies; 80+ messages in thread
From: Joonas Lahtinen @ 2018-12-05 14:20 UTC (permalink / raw)
  To: Ho, Kenny, Kasiviswanathan, Harish, Kuehling, Felix, Tejun Heo
  Cc: cgroups-u79uwXL29TY76Z2rM5mHXA,
	intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	y2kenny-Re5JQEeQqe8AvxtiuMwx3w,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Quoting Kuehling, Felix (2018-12-03 22:55:16)
> 
> On 2018-11-28 4:14 a.m., Joonas Lahtinen wrote:
> > Quoting Ho, Kenny (2018-11-27 17:41:17)
> >> On Tue, Nov 27, 2018 at 4:46 AM Joonas Lahtinen <joonas.lahtinen@linux.intel.com> wrote:
> >>> I think a more abstract property "% of GPU (processing power)" might
> >>> be a more universal approach. One can then implement that through
> >>> subdividing the resources or timeslicing them, depending on the GPU
> >>> topology.
> >>>
> >>> Leasing 1/8th, 1/4th or 1/2 of the GPU would probably be the most
> >>> applicable to cloud provider usecases, too. At least that's what I
> >>> see done for the CPUs today.
> >> I think there are opportunities to slice the gpu in more than one way (similar to the way it is done for cpu.)  We can potentially frame resources as continuous or discrete.  Percentage definitely fits well for continuous measurements such as time/time slices but I think there are places for discrete units such as core counts as well.
> > I think the ask in return to the early series from Intel was to agree
> > on the variables that could be common to all of DRM subsystem.
> >
> > So we can only choose the lowest common denominator, right?
> >
> > Any core count out of total core count should translate nicely into a
> > fraction, so what would be the problem with percentage amounts?
> How would you handle overcommitment with a percentage? That is, more
> than 100% of the GPU cores assigned to cgroups. Which cgroups end up
> sharing cores would be up to chance.

I see your point. With time-slicing, you really can't overcommit. So I would
assume that there would have to be a second level of detail provided for
overcommitting (and deciding which cgroups are to share GPU cores).

> If we allow specifying a set of GPU cores, we can be more specific in
> assigning and sharing resources between cgroups.

As Matt outlined in the other reply to this thread, we don't really have
the concept of GPU cores. We do have the command streamers, but the
granularity is a bit low.

In your architecture, does it matter which specific cores are shared, or
is it just a question of which specific cgroups would share some cores
in case of overcommit?

If we tack on priority in addition to the percentage, you could make
a choice to share cores only at an identical priority level. That'd
mean that in the case of overcommit, you'd aim to keep as many
high-priority levels free of overcommit as possible, and only start
overcommitting for the lower-priority cgroups.

Would that even partially address the concern?

Regards, Joonas

> 
> Regards,
>   Felix
> 
> 
> >
> > Regards, Joonas
> >
> >> Regards,
> >> Kenny
> >>
> >>> That combined with the "GPU memory usable" property should be a good
> >>> starting point to start subdividing the GPU resources for multiple
> >>> users.
> >>>
> >>> Regards, Joonas
> >>>
> >>>> Your feedback is highly appreciated.
> >>>>
> >>>> Best Regards,
> >>>> Harish
> >>>>
> >>>>
> >>>>
> >>>> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> on behalf of Tejun Heo <tj@kernel.org>
> >>>> Sent: Tuesday, November 20, 2018 5:30 PM
> >>>> To: Ho, Kenny
> >>>> Cc: cgroups@vger.kernel.org; intel-gfx@lists.freedesktop.org; y2kenny@gmail.com; amd-gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org
> >>>> Subject: Re: [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices
> >>>>
> >>>>
> >>>> Hello,
> >>>>
> >>>> On Tue, Nov 20, 2018 at 10:21:14PM +0000, Ho, Kenny wrote:
> >>>>> By this reply, are you suggesting that vendor specific resources
> >>>>> will never be acceptable to be managed under cgroup?  Let's say a user
> >>>> I wouldn't say never but whatever gets included as a cgroup
> >>>> controller should have clearly defined resource abstractions and the
> >>>> control schemes around them including support for delegation.  AFAICS,
> >>>> gpu side still seems to have a long way to go (and it's not clear
> >>>> whether that's somewhere it will or needs to end up).
> >>>>
> >>>>> want to have similar functionality as what cgroup is offering but to
> >>>>> manage vendor specific resources, what would you suggest as a
> >>>>> solution?  When you say keeping vendor specific resource regulation
> >>>>> inside drm or specific drivers, do you mean we should replicate the
> >>>>> cgroup infrastructure there or do you mean either drm or specific
> >>>>> driver should query existing hierarchy (such as device or perhaps
> >>>>> cpu) for the process organization information?
> >>>>>
> >>>>> To put the questions in more concrete terms, let's say a user wants to
> >>>>> expose certain part of a gpu to a particular cgroup similar to the
> >>>>> way selective cpu cores are exposed to a cgroup via cpuset, how
> >>>>> should we go about enabling such functionality?
> >>>> Do what the intel driver or bpf is doing?  It's not difficult to hook
> >>>> into cgroup for identification purposes.
> >>>>
> >>>> Thanks.
> >>>>
> >>>> --
> >>>> tejun
> >>>> _______________________________________________
> >>>> amd-gfx mailing list
> >>>> amd-gfx@lists.freedesktop.org
> >>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> >>>>
> >>>> _______________________________________________
> >>>> Intel-gfx mailing list
> >>>> Intel-gfx@lists.freedesktop.org
> >>>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> > _______________________________________________
> > amd-gfx mailing list
> > amd-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 80+ messages in thread

* [RFC PATCH v2 0/5] new cgroup controller for gpu/drm subsystem
  2018-11-20 18:58 [PATCH RFC 0/5] DRM cgroup controller Kenny Ho
                   ` (3 preceding siblings ...)
  2018-11-21  1:43 ` ✗ Fi.CI.BAT: failure for DRM cgroup controller Patchwork
@ 2019-05-09 21:04 ` Kenny Ho
       [not found]   ` <20190509210410.5471-1-Kenny.Ho-5C7GfCeVMHo@public.gmane.org>
                     ` (2 more replies)
  4 siblings, 3 replies; 80+ messages in thread
From: Kenny Ho @ 2019-05-09 21:04 UTC (permalink / raw)
  To: y2kenny, Kenny.Ho, cgroups, dri-devel, amd-gfx, tj, sunnanyong,
	alexander.deucher, brian.welty

This is a follow-up to the RFC I made last November to introduce a cgroup controller for the GPU/DRM subsystem [a].  The goal is to be able to provide resource management for GPU resources using things like containers.  The cover letter from v1 is copied below for reference.

Usage examples:
// set limit for card1 to 1GB
sed -i '2s/.*/1073741824/' /sys/fs/cgroup/<cgroup>/drm.buffer.total.max

// set limit for card0 to 512MB
sed -i '1s/.*/536870912/' /sys/fs/cgroup/<cgroup>/drm.buffer.total.max
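
// for illustration, the other control files added in this series (patches 4
// and 5) follow the same line-per-card layout, e.g. to read the current
// accounting, or to cap the largest single allocation on card0 to 256MB
cat /sys/fs/cgroup/<cgroup>/drm.buffer.total.stats
sed -i '1s/.*/268435456/' /sys/fs/cgroup/<cgroup>/drm.buffer.peak.max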


v2:
* Removed the vendoring concepts
* Added a limit on total buffer allocation
* Added a limit on the maximum size of a single buffer allocation

TODO: process migration
TODO: documentation

[a]: https://lists.freedesktop.org/archives/dri-devel/2018-November/197106.html

v1: cover letter

The purpose of this patch series is to start a discussion for a generic cgroup
controller for the drm subsystem.  The design proposed here is a very early one.
We are hoping to engage the community as we develop the idea.


Backgrounds
==========
Control Groups/cgroup provide a mechanism for aggregating/partitioning sets of
tasks, and all their future children, into hierarchical groups with specialized
behaviour, such as accounting/limiting the resources which processes in a cgroup
can access[1].  Weights, limits, protections, allocations are the main resource
distribution models.  Existing cgroup controllers includes cpu, memory, io,
rdma, and more.  cgroup is one of the foundational technologies that enables the
popular container application deployment and management method.

Direct Rendering Manager/drm contains code intended to support the needs of
complex graphics devices. Graphics drivers in the kernel may make use of DRM
functions to make tasks like memory management, interrupt handling and DMA
easier, and provide a uniform interface to applications.  The DRM has also
developed beyond traditional graphics applications to support compute/GPGPU
applications.


Motivations
=========
As GPU grow beyond the realm of desktop/workstation graphics into areas like
data center clusters and IoT, there are increasing needs to monitor and regulate
GPU as a resource like cpu, memory and io.

Matt Roper from Intel began working on similar idea in early 2018 [2] for the
purpose of managing GPU priority using the cgroup hierarchy.  While that
particular use case may not warrant a standalone drm cgroup controller, there
are other use cases where having one can be useful [3].  Monitoring GPU
resources such as VRAM and buffers, CU (compute unit [AMD's nomenclature])/EU
(execution unit [Intel's nomenclature]), GPU job scheduling [4] can help
sysadmins get a better understanding of the applications usage profile.  Further
usage regulations of the aforementioned resources can also help sysadmins
optimize workload deployment on limited GPU resources.

With the increased importance of machine learning, data science and other
cloud-based applications, GPUs are already in production use in data centers
today [5,6,7].  Existing GPU resource management is very course grain, however,
as sysadmins are only able to distribute workload on a per-GPU basis [8].  An
alternative is to use GPU virtualization (with or without SRIOV) but it
generally acts on the entire GPU instead of the specific resources in a GPU.
With a drm cgroup controller, we can enable alternate, fine-grain, sub-GPU
resource management (in addition to what may be available via GPU
virtualization.)

In addition to production use, the DRM cgroup can also help with testing
graphics application robustness by providing a mean to artificially limit DRM
resources availble to the applications.

Challenges
========
While there are common infrastructure in DRM that is shared across many vendors
(the scheduler [4] for example), there are also aspects of DRM that are vendor
specific.  To accommodate this, we borrowed the mechanism used by the cgroup to
handle different kinds of cgroup controller.

Resources for DRM are also often device (GPU) specific instead of system
specific and a system may contain more than one GPU.  For this, we borrowed some
of the ideas from RDMA cgroup controller.

Approach
=======
To experiment with the idea of a DRM cgroup, we would like to start with basic
accounting and statistics, then continue to iterate and add regulating
mechanisms into the driver.

[1] https://www.kernel.org/doc/Documentation/cgroup-v1/cgroups.txt
[2] https://lists.freedesktop.org/archives/intel-gfx/2018-January/153156.html
[3] https://www.spinics.net/lists/cgroups/msg20720.html
[4] https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/scheduler
[5] https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/
[6] https://blog.openshift.com/gpu-accelerated-sql-queries-with-postgresql-pg-strom-in-openshift-3-10/
[7] https://github.com/RadeonOpenCompute/k8s-device-plugin
[8] https://github.com/kubernetes/kubernetes/issues/52757

Kenny Ho (5):
  cgroup: Introduce cgroup for drm subsystem
  cgroup: Add mechanism to register DRM devices
  drm/amdgpu: Register AMD devices for DRM cgroup
  drm, cgroup: Add total GEM buffer allocation limit
  drm, cgroup: Add peak GEM buffer allocation limit

 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c    |   4 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |   4 +
 drivers/gpu/drm/drm_gem.c                  |   7 +
 drivers/gpu/drm/drm_prime.c                |   9 +
 include/drm/drm_cgroup.h                   |  54 +++
 include/drm/drm_gem.h                      |  11 +
 include/linux/cgroup_drm.h                 |  47 ++
 include/linux/cgroup_subsys.h              |   4 +
 init/Kconfig                               |   5 +
 kernel/cgroup/Makefile                     |   1 +
 kernel/cgroup/drm.c                        | 497 +++++++++++++++++++++
 11 files changed, 643 insertions(+)
 create mode 100644 include/drm/drm_cgroup.h
 create mode 100644 include/linux/cgroup_drm.h
 create mode 100644 kernel/cgroup/drm.c

-- 
2.21.0

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 80+ messages in thread

* [RFC PATCH v2 1/5] cgroup: Introduce cgroup for drm subsystem
       [not found]   ` <20190509210410.5471-1-Kenny.Ho-5C7GfCeVMHo@public.gmane.org>
@ 2019-05-09 21:04     ` Kenny Ho
  2019-05-09 21:04     ` [RFC PATCH v2 2/5] cgroup: Add mechanism to register DRM devices Kenny Ho
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 80+ messages in thread
From: Kenny Ho @ 2019-05-09 21:04 UTC (permalink / raw)
  To: y2kenny-Re5JQEeQqe8AvxtiuMwx3w, Kenny.Ho-5C7GfCeVMHo,
	cgroups-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	tj-DgEjT+Ai2ygdnm+yROfE0A, sunnanyong-hv44wF8Li93QT0dZR+AlfA,
	alexander.deucher-5C7GfCeVMHo,
	brian.welty-ral2JQCrhuEAvxtiuMwx3w

Change-Id: I6830d3990f63f0c13abeba29b1d330cf28882831
Signed-off-by: Kenny Ho <Kenny.Ho@amd.com>
---
 include/linux/cgroup_drm.h    | 32 ++++++++++++++++++++++++++
 include/linux/cgroup_subsys.h |  4 ++++
 init/Kconfig                  |  5 +++++
 kernel/cgroup/Makefile        |  1 +
 kernel/cgroup/drm.c           | 42 +++++++++++++++++++++++++++++++++++
 5 files changed, 84 insertions(+)
 create mode 100644 include/linux/cgroup_drm.h
 create mode 100644 kernel/cgroup/drm.c

diff --git a/include/linux/cgroup_drm.h b/include/linux/cgroup_drm.h
new file mode 100644
index 000000000000..121001be1230
--- /dev/null
+++ b/include/linux/cgroup_drm.h
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: MIT
+ * Copyright 2019 Advanced Micro Devices, Inc.
+ */
+#ifndef _CGROUP_DRM_H
+#define _CGROUP_DRM_H
+
+#ifdef CONFIG_CGROUP_DRM
+
+#include <linux/cgroup.h>
+
+struct drmcgrp {
+	struct cgroup_subsys_state	css;
+};
+
+static inline struct drmcgrp *css_drmcgrp(struct cgroup_subsys_state *css)
+{
+	return css ? container_of(css, struct drmcgrp, css) : NULL;
+}
+
+static inline struct drmcgrp *get_drmcgrp(struct task_struct *task)
+{
+	return css_drmcgrp(task_get_css(task, drm_cgrp_id));
+}
+
+
+static inline struct drmcgrp *parent_drmcgrp(struct drmcgrp *cg)
+{
+	return css_drmcgrp(cg->css.parent);
+}
+
+#endif	/* CONFIG_CGROUP_DRM */
+#endif	/* _CGROUP_DRM_H */
diff --git a/include/linux/cgroup_subsys.h b/include/linux/cgroup_subsys.h
index acb77dcff3b4..ddedad809e8b 100644
--- a/include/linux/cgroup_subsys.h
+++ b/include/linux/cgroup_subsys.h
@@ -61,6 +61,10 @@ SUBSYS(pids)
 SUBSYS(rdma)
 #endif
 
+#if IS_ENABLED(CONFIG_CGROUP_DRM)
+SUBSYS(drm)
+#endif
+
 /*
  * The following subsystems are not supported on the default hierarchy.
  */
diff --git a/init/Kconfig b/init/Kconfig
index d47cb77a220e..0b0f112eb23b 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -839,6 +839,11 @@ config CGROUP_RDMA
 	  Attaching processes with active RDMA resources to the cgroup
 	  hierarchy is allowed even if can cross the hierarchy's limit.
 
+config CGROUP_DRM
+	bool "DRM controller (EXPERIMENTAL)"
+	help
+	  Provides accounting and enforcement of resources in the DRM subsystem.
+
 config CGROUP_FREEZER
 	bool "Freezer controller"
 	help
diff --git a/kernel/cgroup/Makefile b/kernel/cgroup/Makefile
index bfcdae896122..6af14bd93050 100644
--- a/kernel/cgroup/Makefile
+++ b/kernel/cgroup/Makefile
@@ -4,5 +4,6 @@ obj-y := cgroup.o rstat.o namespace.o cgroup-v1.o
 obj-$(CONFIG_CGROUP_FREEZER) += freezer.o
 obj-$(CONFIG_CGROUP_PIDS) += pids.o
 obj-$(CONFIG_CGROUP_RDMA) += rdma.o
+obj-$(CONFIG_CGROUP_DRM) += drm.o
 obj-$(CONFIG_CPUSETS) += cpuset.o
 obj-$(CONFIG_CGROUP_DEBUG) += debug.o
diff --git a/kernel/cgroup/drm.c b/kernel/cgroup/drm.c
new file mode 100644
index 000000000000..620c887d6d24
--- /dev/null
+++ b/kernel/cgroup/drm.c
@@ -0,0 +1,42 @@
+// SPDX-License-Identifier: MIT
+// Copyright 2019 Advanced Micro Devices, Inc.
+#include <linux/slab.h>
+#include <linux/cgroup.h>
+#include <linux/cgroup_drm.h>
+
+static struct drmcgrp *root_drmcgrp __read_mostly;
+
+static void drmcgrp_css_free(struct cgroup_subsys_state *css)
+{
+	struct drmcgrp *drmcgrp = css_drmcgrp(css);
+
+	kfree(css_drmcgrp(css));
+}
+
+static struct cgroup_subsys_state *
+drmcgrp_css_alloc(struct cgroup_subsys_state *parent_css)
+{
+	struct drmcgrp *parent = css_drmcgrp(parent_css);
+	struct drmcgrp *drmcgrp;
+
+	drmcgrp = kzalloc(sizeof(struct drmcgrp), GFP_KERNEL);
+	if (!drmcgrp)
+		return ERR_PTR(-ENOMEM);
+
+	if (!parent)
+		root_drmcgrp = drmcgrp;
+
+	return &drmcgrp->css;
+}
+
+struct cftype files[] = {
+	{ }	/* terminate */
+};
+
+struct cgroup_subsys drm_cgrp_subsys = {
+	.css_alloc	= drmcgrp_css_alloc,
+	.css_free	= drmcgrp_css_free,
+	.early_init	= false,
+	.legacy_cftypes	= files,
+	.dfl_cftypes	= files,
+};
-- 
2.21.0

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [RFC PATCH v2 2/5] cgroup: Add mechanism to register DRM devices
       [not found]   ` <20190509210410.5471-1-Kenny.Ho-5C7GfCeVMHo@public.gmane.org>
  2019-05-09 21:04     ` [RFC PATCH v2 1/5] cgroup: Introduce cgroup for drm subsystem Kenny Ho
@ 2019-05-09 21:04     ` Kenny Ho
  2019-05-09 21:04     ` [RFC PATCH v2 5/5] drm, cgroup: Add peak GEM buffer allocation limit Kenny Ho
  2019-05-10 12:31     ` [RFC PATCH v2 0/5] new cgroup controller for gpu/drm subsystem Christian König
  3 siblings, 0 replies; 80+ messages in thread
From: Kenny Ho @ 2019-05-09 21:04 UTC (permalink / raw)
  To: y2kenny-Re5JQEeQqe8AvxtiuMwx3w, Kenny.Ho-5C7GfCeVMHo,
	cgroups-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	tj-DgEjT+Ai2ygdnm+yROfE0A, sunnanyong-hv44wF8Li93QT0dZR+AlfA,
	alexander.deucher-5C7GfCeVMHo,
	brian.welty-ral2JQCrhuEAvxtiuMwx3w

Change-Id: I908ee6975ea0585e4c30eafde4599f87094d8c65
Signed-off-by: Kenny Ho <Kenny.Ho@amd.com>
---
 include/drm/drm_cgroup.h   |  24 ++++++++
 include/linux/cgroup_drm.h |  10 ++++
 kernel/cgroup/drm.c        | 118 ++++++++++++++++++++++++++++++++++++-
 3 files changed, 151 insertions(+), 1 deletion(-)
 create mode 100644 include/drm/drm_cgroup.h

diff --git a/include/drm/drm_cgroup.h b/include/drm/drm_cgroup.h
new file mode 100644
index 000000000000..ddb9eab64360
--- /dev/null
+++ b/include/drm/drm_cgroup.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: MIT
+ * Copyright 2019 Advanced Micro Devices, Inc.
+ */
+#ifndef __DRM_CGROUP_H__
+#define __DRM_CGROUP_H__
+
+#ifdef CONFIG_CGROUP_DRM
+
+int drmcgrp_register_device(struct drm_device *device);
+
+int drmcgrp_unregister_device(struct drm_device *device);
+
+#else
+static inline int drmcgrp_register_device(struct drm_device *device)
+{
+	return 0;
+}
+
+static inline int drmcgrp_unregister_device(struct drm_device *device)
+{
+	return 0;
+}
+#endif /* CONFIG_CGROUP_DRM */
+#endif /* __DRM_CGROUP_H__ */
diff --git a/include/linux/cgroup_drm.h b/include/linux/cgroup_drm.h
index 121001be1230..d7ccf434ca6b 100644
--- a/include/linux/cgroup_drm.h
+++ b/include/linux/cgroup_drm.h
@@ -6,10 +6,20 @@
 
 #ifdef CONFIG_CGROUP_DRM
 
+#include <linux/mutex.h>
 #include <linux/cgroup.h>
+#include <drm/drm_file.h>
+
+/* limit defined per the way drm_minor_alloc operates */
+#define MAX_DRM_DEV (64 * DRM_MINOR_RENDER)
+
+struct drmcgrp_device_resource {
+	/* for per device stats */
+};
 
 struct drmcgrp {
 	struct cgroup_subsys_state	css;
+	struct drmcgrp_device_resource	*dev_resources[MAX_DRM_DEV];
 };
 
 static inline struct drmcgrp *css_drmcgrp(struct cgroup_subsys_state *css)
diff --git a/kernel/cgroup/drm.c b/kernel/cgroup/drm.c
index 620c887d6d24..f9ef4bf042d8 100644
--- a/kernel/cgroup/drm.c
+++ b/kernel/cgroup/drm.c
@@ -1,16 +1,79 @@
 // SPDX-License-Identifier: MIT
 // Copyright 2019 Advanced Micro Devices, Inc.
+#include <linux/export.h>
 #include <linux/slab.h>
 #include <linux/cgroup.h>
+#include <linux/fs.h>
+#include <linux/seq_file.h>
+#include <linux/mutex.h>
 #include <linux/cgroup_drm.h>
+#include <drm/drm_device.h>
+#include <drm/drm_cgroup.h>
+
+static DEFINE_MUTEX(drmcgrp_mutex);
+
+struct drmcgrp_device {
+	struct drm_device	*dev;
+	struct mutex		mutex;
+};
+
+/* indexed by drm_minor for access speed */
+static struct drmcgrp_device	*known_drmcgrp_devs[MAX_DRM_DEV];
+
+static int max_minor;
+
 
 static struct drmcgrp *root_drmcgrp __read_mostly;
 
 static void drmcgrp_css_free(struct cgroup_subsys_state *css)
 {
 	struct drmcgrp *drmcgrp = css_drmcgrp(css);
+	int i;
+
+	for (i = 0; i <= max_minor; i++) {
+		if (drmcgrp->dev_resources[i] != NULL)
+			kfree(drmcgrp->dev_resources[i]);
+	}
+
+	kfree(drmcgrp);
+}
+
+static inline int init_drmcgrp_single(struct drmcgrp *drmcgrp, int i)
+{
+	struct drmcgrp_device_resource *ddr = drmcgrp->dev_resources[i];
+
+	if (ddr == NULL) {
+		ddr = kzalloc(sizeof(struct drmcgrp_device_resource),
+			GFP_KERNEL);
+
+		if (!ddr)
+			return -ENOMEM;
+
+		drmcgrp->dev_resources[i] = ddr;
+	}
+
+	/* set defaults here */
+
+	return 0;
+}
+
+static inline int init_drmcgrp(struct drmcgrp *drmcgrp, struct drm_device *dev)
+{
+	int rc = 0;
+	int i;
+
+	if (dev != NULL) {
+		rc = init_drmcgrp_single(drmcgrp, dev->primary->index);
+		return rc;
+	}
+
+	for (i = 0; i <= max_minor; i++) {
+		rc = init_drmcgrp_single(drmcgrp, i);
+		if (rc)
+			return rc;
+	}
 
-	kfree(css_drmcgrp(css));
+	return 0;
 }
 
 static struct cgroup_subsys_state *
@@ -18,11 +81,18 @@ drmcgrp_css_alloc(struct cgroup_subsys_state *parent_css)
 {
 	struct drmcgrp *parent = css_drmcgrp(parent_css);
 	struct drmcgrp *drmcgrp;
+	int rc;
 
 	drmcgrp = kzalloc(sizeof(struct drmcgrp), GFP_KERNEL);
 	if (!drmcgrp)
 		return ERR_PTR(-ENOMEM);
 
+	rc = init_drmcgrp(drmcgrp, NULL);
+	if (rc) {
+		drmcgrp_css_free(&drmcgrp->css);
+		return ERR_PTR(rc);
+	}
+
 	if (!parent)
 		root_drmcgrp = drmcgrp;
 
@@ -40,3 +110,49 @@ struct cgroup_subsys drm_cgrp_subsys = {
 	.legacy_cftypes	= files,
 	.dfl_cftypes	= files,
 };
+
+int drmcgrp_register_device(struct drm_device *dev)
+{
+	struct drmcgrp_device *ddev;
+	struct cgroup_subsys_state *pos;
+	struct drmcgrp *child;
+
+	ddev = kzalloc(sizeof(struct drmcgrp_device), GFP_KERNEL);
+	if (!ddev)
+		return -ENOMEM;
+
+	ddev->dev = dev;
+	mutex_init(&ddev->mutex);
+
+	mutex_lock(&drmcgrp_mutex);
+	known_drmcgrp_devs[dev->primary->index] = ddev;
+	max_minor = max(max_minor, dev->primary->index);
+	mutex_unlock(&drmcgrp_mutex);
+
+	/* init cgroups created before registration (i.e. root cgroup) */
+	if (root_drmcgrp != NULL) {
+		init_drmcgrp(root_drmcgrp, dev);
+
+		rcu_read_lock();
+		css_for_each_child(pos, &root_drmcgrp->css) {
+			child = css_drmcgrp(pos);
+			init_drmcgrp(child, dev);
+		}
+		rcu_read_unlock();
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL(drmcgrp_register_device);
+
+int drmcgrp_unregister_device(struct drm_device *dev)
+{
+	mutex_lock(&drmcgrp_mutex);
+
+	kfree(known_drmcgrp_devs[dev->primary->index]);
+	known_drmcgrp_devs[dev->primary->index] = NULL;
+
+	mutex_unlock(&drmcgrp_mutex);
+	return 0;
+}
+EXPORT_SYMBOL(drmcgrp_unregister_device);
-- 
2.21.0

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [RFC PATCH v2 3/5] drm/amdgpu: Register AMD devices for DRM cgroup
  2019-05-09 21:04 ` [RFC PATCH v2 0/5] new cgroup controller for gpu/drm subsystem Kenny Ho
       [not found]   ` <20190509210410.5471-1-Kenny.Ho-5C7GfCeVMHo@public.gmane.org>
@ 2019-05-09 21:04   ` Kenny Ho
  2019-05-09 21:04   ` [RFC PATCH v2 4/5] drm, cgroup: Add total GEM buffer allocation limit Kenny Ho
  2 siblings, 0 replies; 80+ messages in thread
From: Kenny Ho @ 2019-05-09 21:04 UTC (permalink / raw)
  To: y2kenny, Kenny.Ho, cgroups, dri-devel, amd-gfx, tj, sunnanyong,
	alexander.deucher, brian.welty

Change-Id: I3750fc657b956b52750a36cb303c54fa6a265b44
Signed-off-by: Kenny Ho <Kenny.Ho@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index da7b4fe8ade3..2568fd730161 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -28,6 +28,7 @@
 #include <drm/drmP.h>
 #include "amdgpu.h"
 #include <drm/amdgpu_drm.h>
+#include <drm/drm_cgroup.h>
 #include "amdgpu_sched.h"
 #include "amdgpu_uvd.h"
 #include "amdgpu_vce.h"
@@ -97,6 +98,7 @@ void amdgpu_driver_unload_kms(struct drm_device *dev)
 
 	amdgpu_device_fini(adev);
 
+	drmcgrp_unregister_device(dev);
 done_free:
 	kfree(adev);
 	dev->dev_private = NULL;
@@ -141,6 +143,8 @@ int amdgpu_driver_load_kms(struct drm_device *dev, unsigned long flags)
 	struct amdgpu_device *adev;
 	int r, acpi_status;
 
+	drmcgrp_register_device(dev);
+
 #ifdef CONFIG_DRM_AMDGPU_SI
 	if (!amdgpu_si_support) {
 		switch (flags & AMD_ASIC_MASK) {
-- 
2.21.0

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [RFC PATCH v2 4/5] drm, cgroup: Add total GEM buffer allocation limit
  2019-05-09 21:04 ` [RFC PATCH v2 0/5] new cgroup controller for gpu/drm subsystem Kenny Ho
       [not found]   ` <20190509210410.5471-1-Kenny.Ho-5C7GfCeVMHo@public.gmane.org>
  2019-05-09 21:04   ` [RFC PATCH v2 3/5] drm/amdgpu: Register AMD devices for DRM cgroup Kenny Ho
@ 2019-05-09 21:04   ` Kenny Ho
       [not found]     ` <20190509210410.5471-5-Kenny.Ho-5C7GfCeVMHo@public.gmane.org>
  2019-05-15 21:26     ` Welty, Brian
  2 siblings, 2 replies; 80+ messages in thread
From: Kenny Ho @ 2019-05-09 21:04 UTC (permalink / raw)
  To: y2kenny, Kenny.Ho, cgroups, dri-devel, amd-gfx, tj, sunnanyong,
	alexander.deucher, brian.welty

The drm resource being measured and limited here is the GEM buffer
objects.  User applications allocate and free these buffers.  In
addition, a process can allocate a buffer and share it with another
process.  The consumer of a shared buffer can also outlive the
allocator of the buffer.

For the purpose of cgroup accounting and limiting, ownership of the
buffer is deemed to be the cgroup to which the allocating process
belongs.  There is one limit per drm device.

In order to prevent a buffer from outliving the cgroup that owns it, a
process is prevented from importing buffers that are not owned by the
process' cgroup or one of its ancestors.

For this resource, the control files are prefixed with drm.buffer.total.

There are four control file types:
stats (ro) - display current measured values for a resource
max (rw) - limits for a resource
default (ro, root cgroup only) - default values for a resource
help (ro, root cgroup only) - help string for a resource

Each file is multi-lined with one entry/line per drm device.

Usage examples:
// set limit for card1 to 1GB
sed -i '2s/.*/1073741824/' /sys/fs/cgroup/<cgroup>/drm.buffer.total.max

// set limit for card0 to 512MB
sed -i '1s/.*/536870912/' /sys/fs/cgroup/<cgroup>/drm.buffer.total.max
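
// for illustration, current usage can be read back from the stats file,
// which uses the same line-per-card layout
cat /sys/fs/cgroup/<cgroup>/drm.buffer.total.stats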

Change-Id: I4c249d06d45ec709d6481d4cbe87c5168545c5d0
Signed-off-by: Kenny Ho <Kenny.Ho@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |   4 +
 drivers/gpu/drm/drm_gem.c                  |   7 +
 drivers/gpu/drm/drm_prime.c                |   9 +
 include/drm/drm_cgroup.h                   |  34 ++-
 include/drm/drm_gem.h                      |  11 +
 include/linux/cgroup_drm.h                 |   3 +
 kernel/cgroup/drm.c                        | 280 +++++++++++++++++++++
 7 files changed, 346 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 93b2c5a48a71..b4c078b7ad63 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -34,6 +34,7 @@
 #include <drm/drmP.h>
 #include <drm/amdgpu_drm.h>
 #include <drm/drm_cache.h>
+#include <drm/drm_cgroup.h>
 #include "amdgpu.h"
 #include "amdgpu_trace.h"
 #include "amdgpu_amdkfd.h"
@@ -446,6 +447,9 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
 	if (!amdgpu_bo_validate_size(adev, size, bp->domain))
 		return -ENOMEM;
 
+	if (!drmcgrp_bo_can_allocate(current, adev->ddev, size))
+		return -ENOMEM;
+
 	*bo_ptr = NULL;
 
 	acc_size = ttm_bo_dma_acc_size(&adev->mman.bdev, size,
diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 6a80db077dc6..cbd49bf34dcf 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -37,10 +37,12 @@
 #include <linux/shmem_fs.h>
 #include <linux/dma-buf.h>
 #include <linux/mem_encrypt.h>
+#include <linux/cgroup_drm.h>
 #include <drm/drmP.h>
 #include <drm/drm_vma_manager.h>
 #include <drm/drm_gem.h>
 #include <drm/drm_print.h>
+#include <drm/drm_cgroup.h>
 #include "drm_internal.h"
 
 /** @file drm_gem.c
@@ -154,6 +156,9 @@ void drm_gem_private_object_init(struct drm_device *dev,
 	obj->handle_count = 0;
 	obj->size = size;
 	drm_vma_node_reset(&obj->vma_node);
+
+	obj->drmcgrp = get_drmcgrp(current);
+	drmcgrp_chg_bo_alloc(obj->drmcgrp, dev, size);
 }
 EXPORT_SYMBOL(drm_gem_private_object_init);
 
@@ -804,6 +809,8 @@ drm_gem_object_release(struct drm_gem_object *obj)
 	if (obj->filp)
 		fput(obj->filp);
 
+	drmcgrp_unchg_bo_alloc(obj->drmcgrp, obj->dev, obj->size);
+
 	drm_gem_free_mmap_offset(obj);
 }
 EXPORT_SYMBOL(drm_gem_object_release);
diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 231e3f6d5f41..faed5611a1c6 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -32,6 +32,7 @@
 #include <drm/drm_prime.h>
 #include <drm/drm_gem.h>
 #include <drm/drmP.h>
+#include <drm/drm_cgroup.h>
 
 #include "drm_internal.h"
 
@@ -794,6 +795,7 @@ int drm_gem_prime_fd_to_handle(struct drm_device *dev,
 {
 	struct dma_buf *dma_buf;
 	struct drm_gem_object *obj;
+	struct drmcgrp *drmcgrp = get_drmcgrp(current);
 	int ret;
 
 	dma_buf = dma_buf_get(prime_fd);
@@ -818,6 +820,13 @@ int drm_gem_prime_fd_to_handle(struct drm_device *dev,
 		goto out_unlock;
 	}
 
+	/* only allow bo from the same cgroup or its ancestor to be imported */
+	if (drmcgrp != NULL &&
+			!drmcgrp_is_self_or_ancestor(drmcgrp, obj->drmcgrp)) {
+		ret = -EACCES;
+		goto out_unlock;
+	}
+
 	if (obj->dma_buf) {
 		WARN_ON(obj->dma_buf != dma_buf);
 	} else {
diff --git a/include/drm/drm_cgroup.h b/include/drm/drm_cgroup.h
index ddb9eab64360..8711b7c5f7bf 100644
--- a/include/drm/drm_cgroup.h
+++ b/include/drm/drm_cgroup.h
@@ -4,12 +4,20 @@
 #ifndef __DRM_CGROUP_H__
 #define __DRM_CGROUP_H__
 
+#include <linux/cgroup_drm.h>
+
 #ifdef CONFIG_CGROUP_DRM
 
 int drmcgrp_register_device(struct drm_device *device);
-
 int drmcgrp_unregister_device(struct drm_device *device);
-
+bool drmcgrp_is_self_or_ancestor(struct drmcgrp *self,
+		struct drmcgrp *relative);
+void drmcgrp_chg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
+		size_t size);
+void drmcgrp_unchg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
+		size_t size);
+bool drmcgrp_bo_can_allocate(struct task_struct *task, struct drm_device *dev,
+		size_t size);
 #else
 static inline int drmcgrp_register_device(struct drm_device *device)
 {
@@ -20,5 +28,27 @@ static inline int drmcgrp_unregister_device(struct drm_device *device)
 {
 	return 0;
 }
+
+static inline bool drmcgrp_is_self_or_ancestor(struct drmcgrp *self,
+		struct drmcgrp *relative)
+{
+	return false;
+}
+
+static inline void drmcgrp_chg_bo_alloc(struct drmcgrp *drmcgrp,
+		struct drm_device *dev,	size_t size)
+{
+}
+
+static inline void drmcgrp_unchg_bo_alloc(struct drmcgrp *drmcgrp,
+		struct drm_device *dev,	size_t size)
+{
+}
+
+static inline bool drmcgrp_bo_can_allocate(struct task_struct *task,
+		struct drm_device *dev,	size_t size)
+{
+	return true;
+}
 #endif /* CONFIG_CGROUP_DRM */
 #endif /* __DRM_CGROUP_H__ */
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
index c95727425284..02854c674b5c 100644
--- a/include/drm/drm_gem.h
+++ b/include/drm/drm_gem.h
@@ -272,6 +272,17 @@ struct drm_gem_object {
 	 *
 	 */
 	const struct drm_gem_object_funcs *funcs;
+
+	/**
+	 * @drmcgrp:
+	 *
+	 * DRM cgroup this GEM object belongs to.
+         *
+         * This is used to track and limit the amount of GEM objects a user
+         * can allocate.  Since GEM objects can be shared, this is also used
+         * to ensure GEM objects are only shared within the same cgroup.
+	 */
+	struct drmcgrp *drmcgrp;
 };
 
 /**
diff --git a/include/linux/cgroup_drm.h b/include/linux/cgroup_drm.h
index d7ccf434ca6b..fe14ba7bb1cf 100644
--- a/include/linux/cgroup_drm.h
+++ b/include/linux/cgroup_drm.h
@@ -15,6 +15,9 @@
 
 struct drmcgrp_device_resource {
 	/* for per device stats */
+	s64			bo_stats_total_allocated;
+
+	s64			bo_limits_total_allocated;
 };
 
 struct drmcgrp {
diff --git a/kernel/cgroup/drm.c b/kernel/cgroup/drm.c
index f9ef4bf042d8..bc3abff09113 100644
--- a/kernel/cgroup/drm.c
+++ b/kernel/cgroup/drm.c
@@ -15,6 +15,22 @@ static DEFINE_MUTEX(drmcgrp_mutex);
 struct drmcgrp_device {
 	struct drm_device	*dev;
 	struct mutex		mutex;
+
+	s64			bo_limits_total_allocated_default;
+};
+
+#define DRMCG_CTF_PRIV_SIZE 3
+#define DRMCG_CTF_PRIV_MASK GENMASK((DRMCG_CTF_PRIV_SIZE - 1), 0)
+
+enum drmcgrp_res_type {
+	DRMCGRP_TYPE_BO_TOTAL,
+};
+
+enum drmcgrp_file_type {
+	DRMCGRP_FTYPE_STATS,
+	DRMCGRP_FTYPE_MAX,
+	DRMCGRP_FTYPE_DEFAULT,
+	DRMCGRP_FTYPE_HELP,
 };
 
 /* indexed by drm_minor for access speed */
@@ -53,6 +69,10 @@ static inline int init_drmcgrp_single(struct drmcgrp *drmcgrp, int i)
 	}
 
 	/* set defaults here */
+	if (known_drmcgrp_devs[i] != NULL) {
+		ddr->bo_limits_total_allocated =
+		  known_drmcgrp_devs[i]->bo_limits_total_allocated_default;
+	}
 
 	return 0;
 }
@@ -99,7 +119,187 @@ drmcgrp_css_alloc(struct cgroup_subsys_state *parent_css)
 	return &drmcgrp->css;
 }
 
+static inline void drmcgrp_print_stats(struct drmcgrp_device_resource *ddr,
+		struct seq_file *sf, enum drmcgrp_res_type type)
+{
+	if (ddr == NULL) {
+		seq_puts(sf, "\n");
+		return;
+	}
+
+	switch (type) {
+	case DRMCGRP_TYPE_BO_TOTAL:
+		seq_printf(sf, "%lld\n", ddr->bo_stats_total_allocated);
+		break;
+	default:
+		seq_puts(sf, "\n");
+		break;
+	}
+}
+
+static inline void drmcgrp_print_limits(struct drmcgrp_device_resource *ddr,
+		struct seq_file *sf, enum drmcgrp_res_type type)
+{
+	if (ddr == NULL) {
+		seq_puts(sf, "\n");
+		return;
+	}
+
+	switch (type) {
+	case DRMCGRP_TYPE_BO_TOTAL:
+		seq_printf(sf, "%lld\n", ddr->bo_limits_total_allocated);
+		break;
+	default:
+		seq_puts(sf, "\n");
+		break;
+	}
+}
+
+static inline void drmcgrp_print_default(struct drmcgrp_device *ddev,
+		struct seq_file *sf, enum drmcgrp_res_type type)
+{
+	if (ddev == NULL) {
+		seq_puts(sf, "\n");
+		return;
+	}
+
+	switch (type) {
+	case DRMCGRP_TYPE_BO_TOTAL:
+		seq_printf(sf, "%lld\n", ddev->bo_limits_total_allocated_default);
+		break;
+	default:
+		seq_puts(sf, "\n");
+		break;
+	}
+}
+
+static inline void drmcgrp_print_help(int cardNum, struct seq_file *sf,
+		enum drmcgrp_res_type type)
+{
+	switch (type) {
+	case DRMCGRP_TYPE_BO_TOTAL:
+		seq_printf(sf,
+		"Total amount of buffer allocation in bytes for card%d\n",
+		cardNum);
+		break;
+	default:
+		seq_puts(sf, "\n");
+		break;
+	}
+}
+
+int drmcgrp_bo_show(struct seq_file *sf, void *v)
+{
+	struct drmcgrp *drmcgrp = css_drmcgrp(seq_css(sf));
+	struct drmcgrp_device_resource *ddr = NULL;
+	enum drmcgrp_file_type f_type = seq_cft(sf)->
+		private & DRMCG_CTF_PRIV_MASK;
+	enum drmcgrp_res_type type = seq_cft(sf)->
+		private >> DRMCG_CTF_PRIV_SIZE;
+	struct drmcgrp_device *ddev;
+	int i;
+
+	for (i = 0; i <= max_minor; i++) {
+		ddr = drmcgrp->dev_resources[i];
+		ddev = known_drmcgrp_devs[i];
+
+		switch (f_type) {
+		case DRMCGRP_FTYPE_STATS:
+			drmcgrp_print_stats(ddr, sf, type);
+			break;
+		case DRMCGRP_FTYPE_MAX:
+			drmcgrp_print_limits(ddr, sf, type);
+			break;
+		case DRMCGRP_FTYPE_DEFAULT:
+			drmcgrp_print_default(ddev, sf, type);
+			break;
+		case DRMCGRP_FTYPE_HELP:
+			drmcgrp_print_help(i, sf, type);
+			break;
+		default:
+			seq_puts(sf, "\n");
+			break;
+		}
+	}
+
+	return 0;
+}
+
+ssize_t drmcgrp_bo_limit_write(struct kernfs_open_file *of, char *buf,
+		size_t nbytes, loff_t off)
+{
+	struct drmcgrp *drmcgrp = css_drmcgrp(of_css(of));
+	enum drmcgrp_res_type type = of_cft(of)->private >> DRMCG_CTF_PRIV_SIZE;
+	char *cft_name = of_cft(of)->name;
+	char *limits = strstrip(buf);
+	struct drmcgrp_device_resource *ddr;
+	char *sval;
+	s64 val;
+	int i = 0;
+	int rc;
+
+	while (i <= max_minor && limits != NULL) {
+		sval =  strsep(&limits, "\n");
+		rc = kstrtoll(sval, 0, &val);
+
+		if (rc) {
+			pr_err("drmcgrp: %s: minor %d, err %d. ",
+				cft_name, i, rc);
+			pr_cont_cgroup_name(drmcgrp->css.cgroup);
+			pr_cont("\n");
+		} else {
+			ddr = drmcgrp->dev_resources[i];
+			switch (type) {
+			case DRMCGRP_TYPE_BO_TOTAL:
+                                if (val < 0) continue;
+				ddr->bo_limits_total_allocated = val;
+				break;
+			default:
+				break;
+			}
+		}
+
+		i++;
+	}
+
+	if (i <= max_minor) {
+		pr_err("drmcgrp: %s: less entries than # of drm devices. ",
+				cft_name);
+		pr_cont_cgroup_name(drmcgrp->css.cgroup);
+		pr_cont("\n");
+	}
+
+	return nbytes;
+}
+
 struct cftype files[] = {
+	{
+		.name = "buffer.total.stats",
+		.seq_show = drmcgrp_bo_show,
+		.private = (DRMCGRP_TYPE_BO_TOTAL << DRMCG_CTF_PRIV_SIZE) |
+			DRMCGRP_FTYPE_STATS,
+	},
+	{
+		.name = "buffer.total.default",
+		.seq_show = drmcgrp_bo_show,
+		.flags = CFTYPE_ONLY_ON_ROOT,
+		.private = (DRMCGRP_TYPE_BO_TOTAL << DRMCG_CTF_PRIV_SIZE) |
+			DRMCGRP_FTYPE_DEFAULT,
+	},
+	{
+		.name = "buffer.total.help",
+		.seq_show = drmcgrp_bo_show,
+		.flags = CFTYPE_ONLY_ON_ROOT,
+		.private = (DRMCGRP_TYPE_BO_TOTAL << DRMCG_CTF_PRIV_SIZE) |
+			DRMCGRP_FTYPE_HELP,
+	},
+	{
+		.name = "buffer.total.max",
+		.write = drmcgrp_bo_limit_write,
+		.seq_show = drmcgrp_bo_show,
+		.private = (DRMCGRP_TYPE_BO_TOTAL << DRMCG_CTF_PRIV_SIZE) |
+			DRMCGRP_FTYPE_MAX,
+	},
 	{ }	/* terminate */
 };
 
@@ -122,6 +322,8 @@ int drmcgrp_register_device(struct drm_device *dev)
 		return -ENOMEM;
 
 	ddev->dev = dev;
+	ddev->bo_limits_total_allocated_default = S64_MAX;
+
 	mutex_init(&ddev->mutex);
 
 	mutex_lock(&drmcgrp_mutex);
@@ -156,3 +358,81 @@ int drmcgrp_unregister_device(struct drm_device *dev)
 	return 0;
 }
 EXPORT_SYMBOL(drmcgrp_unregister_device);
+
+bool drmcgrp_is_self_or_ancestor(struct drmcgrp *self, struct drmcgrp *relative)
+{
+	for (; self != NULL; self = parent_drmcgrp(self))
+		if (self == relative)
+			return true;
+
+	return false;
+}
+EXPORT_SYMBOL(drmcgrp_is_self_or_ancestor);
+
+bool drmcgrp_bo_can_allocate(struct task_struct *task, struct drm_device *dev,
+		size_t size)
+{
+	struct drmcgrp *drmcgrp = get_drmcgrp(task);
+	struct drmcgrp_device_resource *ddr;
+	struct drmcgrp_device_resource *d;
+	int devIdx = dev->primary->index;
+	bool result = true;
+	s64 delta = 0;
+
+	if (drmcgrp == NULL || drmcgrp == root_drmcgrp)
+		return true;
+
+	ddr = drmcgrp->dev_resources[devIdx];
+	mutex_lock(&known_drmcgrp_devs[devIdx]->mutex);
+	for ( ; drmcgrp != root_drmcgrp; drmcgrp = parent_drmcgrp(drmcgrp)) {
+		d = drmcgrp->dev_resources[devIdx];
+		delta = d->bo_limits_total_allocated -
+				d->bo_stats_total_allocated;
+
+		if (delta <= 0 || size > delta) {
+			result = false;
+			break;
+		}
+	}
+	mutex_unlock(&known_drmcgrp_devs[devIdx]->mutex);
+
+	return result;
+}
+EXPORT_SYMBOL(drmcgrp_bo_can_allocate);
+
+void drmcgrp_chg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
+		size_t size)
+{
+	struct drmcgrp_device_resource *ddr;
+	int devIdx = dev->primary->index;
+
+	if (drmcgrp == NULL || known_drmcgrp_devs[devIdx] == NULL)
+		return;
+
+	mutex_lock(&known_drmcgrp_devs[devIdx]->mutex);
+	for ( ; drmcgrp != NULL; drmcgrp = parent_drmcgrp(drmcgrp)) {
+		ddr = drmcgrp->dev_resources[devIdx];
+
+		ddr->bo_stats_total_allocated += (s64)size;
+	}
+	mutex_unlock(&known_drmcgrp_devs[devIdx]->mutex);
+}
+EXPORT_SYMBOL(drmcgrp_chg_bo_alloc);
+
+void drmcgrp_unchg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
+		size_t size)
+{
+	struct drmcgrp_device_resource *ddr;
+	int devIdx = dev->primary->index;
+
+	if (drmcgrp == NULL || known_drmcgrp_devs[devIdx] == NULL)
+		return;
+
+	ddr = drmcgrp->dev_resources[devIdx];
+	mutex_lock(&known_drmcgrp_devs[devIdx]->mutex);
+	for ( ; drmcgrp != NULL; drmcgrp = parent_drmcgrp(drmcgrp))
+		drmcgrp->dev_resources[devIdx]->bo_stats_total_allocated
+			-= (s64)size;
+	mutex_unlock(&known_drmcgrp_devs[devIdx]->mutex);
+}
+EXPORT_SYMBOL(drmcgrp_unchg_bo_alloc);
-- 
2.21.0

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [RFC PATCH v2 5/5] drm, cgroup: Add peak GEM buffer allocation limit
       [not found]   ` <20190509210410.5471-1-Kenny.Ho-5C7GfCeVMHo@public.gmane.org>
  2019-05-09 21:04     ` [RFC PATCH v2 1/5] cgroup: Introduce cgroup for drm subsystem Kenny Ho
  2019-05-09 21:04     ` [RFC PATCH v2 2/5] cgroup: Add mechanism to register DRM devices Kenny Ho
@ 2019-05-09 21:04     ` Kenny Ho
       [not found]       ` <20190509210410.5471-6-Kenny.Ho-5C7GfCeVMHo@public.gmane.org>
  2019-05-10 12:31     ` [RFC PATCH v2 0/5] new cgroup controller for gpu/drm subsystem Christian König
  3 siblings, 1 reply; 80+ messages in thread
From: Kenny Ho @ 2019-05-09 21:04 UTC (permalink / raw)
  To: y2kenny-Re5JQEeQqe8AvxtiuMwx3w, Kenny.Ho-5C7GfCeVMHo,
	cgroups-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	tj-DgEjT+Ai2ygdnm+yROfE0A, sunnanyong-hv44wF8Li93QT0dZR+AlfA,
	alexander.deucher-5C7GfCeVMHo,
	brian.welty-ral2JQCrhuEAvxtiuMwx3w

This new drmcgrp resource limits the largest GEM buffer that can be
allocated in a cgroup.
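
Usage example (for illustration, same line-per-card layout as the total limit):
// cap the largest single allocation for card0 to 256MB
sed -i '1s/.*/268435456/' /sys/fs/cgroup/<cgroup>/drm.buffer.peak.max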

Change-Id: I0830d56775568e1cf215b56cc892d5e7945e9f25
Signed-off-by: Kenny Ho <Kenny.Ho@amd.com>
---
 include/linux/cgroup_drm.h |  2 ++
 kernel/cgroup/drm.c        | 59 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 61 insertions(+)

diff --git a/include/linux/cgroup_drm.h b/include/linux/cgroup_drm.h
index fe14ba7bb1cf..57c07a148975 100644
--- a/include/linux/cgroup_drm.h
+++ b/include/linux/cgroup_drm.h
@@ -16,8 +16,10 @@
 struct drmcgrp_device_resource {
 	/* for per device stats */
 	s64			bo_stats_total_allocated;
+	size_t			bo_stats_peak_allocated;
 
 	s64			bo_limits_total_allocated;
+	size_t			bo_limits_peak_allocated;
 };
 
 struct drmcgrp {
diff --git a/kernel/cgroup/drm.c b/kernel/cgroup/drm.c
index bc3abff09113..5c7e1b8059ce 100644
--- a/kernel/cgroup/drm.c
+++ b/kernel/cgroup/drm.c
@@ -17,6 +17,7 @@ struct drmcgrp_device {
 	struct mutex		mutex;
 
 	s64			bo_limits_total_allocated_default;
+	size_t			bo_limits_peak_allocated_default;
 };
 
 #define DRMCG_CTF_PRIV_SIZE 3
@@ -24,6 +25,7 @@ struct drmcgrp_device {
 
 enum drmcgrp_res_type {
 	DRMCGRP_TYPE_BO_TOTAL,
+	DRMCGRP_TYPE_BO_PEAK,
 };
 
 enum drmcgrp_file_type {
@@ -72,6 +74,9 @@ static inline int init_drmcgrp_single(struct drmcgrp *drmcgrp, int i)
 	if (known_drmcgrp_devs[i] != NULL) {
 		ddr->bo_limits_total_allocated =
 		  known_drmcgrp_devs[i]->bo_limits_total_allocated_default;
+
+		ddr->bo_limits_peak_allocated =
+		  known_drmcgrp_devs[i]->bo_limits_peak_allocated_default;
 	}
 
 	return 0;
@@ -131,6 +136,9 @@ static inline void drmcgrp_print_stats(struct drmcgrp_device_resource *ddr,
 	case DRMCGRP_TYPE_BO_TOTAL:
 		seq_printf(sf, "%lld\n", ddr->bo_stats_total_allocated);
 		break;
+	case DRMCGRP_TYPE_BO_PEAK:
+		seq_printf(sf, "%zu\n", ddr->bo_stats_peak_allocated);
+		break;
 	default:
 		seq_puts(sf, "\n");
 		break;
@@ -149,6 +157,9 @@ static inline void drmcgrp_print_limits(struct drmcgrp_device_resource *ddr,
 	case DRMCGRP_TYPE_BO_TOTAL:
 		seq_printf(sf, "%lld\n", ddr->bo_limits_total_allocated);
 		break;
+	case DRMCGRP_TYPE_BO_PEAK:
+		seq_printf(sf, "%zu\n", ddr->bo_limits_peak_allocated);
+		break;
 	default:
 		seq_puts(sf, "\n");
 		break;
@@ -167,6 +178,9 @@ static inline void drmcgrp_print_default(struct drmcgrp_device *ddev,
 	case DRMCGRP_TYPE_BO_TOTAL:
 		seq_printf(sf, "%lld\n", ddev->bo_limits_total_allocated_default);
 		break;
+	case DRMCGRP_TYPE_BO_PEAK:
+		seq_printf(sf, "%zu\n", ddev->bo_limits_peak_allocated_default);
+		break;
 	default:
 		seq_puts(sf, "\n");
 		break;
@@ -182,6 +196,11 @@ static inline void drmcgrp_print_help(int cardNum, struct seq_file *sf,
 		"Total amount of buffer allocation in bytes for card%d\n",
 		cardNum);
 		break;
+	case DRMCGRP_TYPE_BO_PEAK:
+		seq_printf(sf,
+		"Largest buffer allocation in bytes for card%d\n",
+		cardNum);
+		break;
 	default:
 		seq_puts(sf, "\n");
 		break;
@@ -254,6 +273,10 @@ ssize_t drmcgrp_bo_limit_write(struct kernfs_open_file *of, char *buf,
                                 if (val < 0) continue;
 				ddr->bo_limits_total_allocated = val;
 				break;
+			case DRMCGRP_TYPE_BO_PEAK:
+                                if (val < 0) continue;
+				ddr->bo_limits_peak_allocated = val;
+				break;
 			default:
 				break;
 			}
@@ -300,6 +323,33 @@ struct cftype files[] = {
 		.private = (DRMCGRP_TYPE_BO_TOTAL << DRMCG_CTF_PRIV_SIZE) |
 			DRMCGRP_FTYPE_MAX,
 	},
+	{
+		.name = "buffer.peak.stats",
+		.seq_show = drmcgrp_bo_show,
+		.private = (DRMCGRP_TYPE_BO_PEAK << DRMCG_CTF_PRIV_SIZE) |
+			DRMCGRP_FTYPE_STATS,
+	},
+	{
+		.name = "buffer.peak.default",
+		.seq_show = drmcgrp_bo_show,
+		.flags = CFTYPE_ONLY_ON_ROOT,
+		.private = (DRMCGRP_TYPE_BO_PEAK << DRMCG_CTF_PRIV_SIZE) |
+			DRMCGRP_FTYPE_DEFAULT,
+	},
+	{
+		.name = "buffer.peak.help",
+		.seq_show = drmcgrp_bo_show,
+		.flags = CFTYPE_ONLY_ON_ROOT,
+		.private = (DRMCGRP_TYPE_BO_PEAK << DRMCG_CTF_PRIV_SIZE) |
+			DRMCGRP_FTYPE_HELP,
+	},
+	{
+		.name = "buffer.peak.max",
+		.write = drmcgrp_bo_limit_write,
+		.seq_show = drmcgrp_bo_show,
+		.private = (DRMCGRP_TYPE_BO_PEAK << DRMCG_CTF_PRIV_SIZE) |
+			DRMCGRP_FTYPE_MAX,
+	},
 	{ }	/* terminate */
 };
 
@@ -323,6 +373,7 @@ int drmcgrp_register_device(struct drm_device *dev)
 
 	ddev->dev = dev;
 	ddev->bo_limits_total_allocated_default = S64_MAX;
+	ddev->bo_limits_peak_allocated_default = SIZE_MAX;
 
 	mutex_init(&ddev->mutex);
 
@@ -393,6 +444,11 @@ bool drmcgrp_bo_can_allocate(struct task_struct *task, struct drm_device *dev,
 			result = false;
 			break;
 		}
+
+		if (d->bo_limits_peak_allocated < size) {
+			result = false;
+			break;
+		}
 	}
 	mutex_unlock(&known_drmcgrp_devs[devIdx]->mutex);
 
@@ -414,6 +470,9 @@ void drmcgrp_chg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
 		ddr = drmcgrp->dev_resources[devIdx];
 
 		ddr->bo_stats_total_allocated += (s64)size;
+
+		if (ddr->bo_stats_peak_allocated < (size_t)size)
+			ddr->bo_stats_peak_allocated = (size_t)size;
 	}
 	mutex_unlock(&known_drmcgrp_devs[devIdx]->mutex);
 }
-- 
2.21.0


* Re: [RFC PATCH v2 4/5] drm, cgroup: Add total GEM buffer allocation limit
       [not found]     ` <20190509210410.5471-5-Kenny.Ho-5C7GfCeVMHo@public.gmane.org>
@ 2019-05-10 12:28       ` Christian König
       [not found]         ` <f63c8d6b-92a4-2977-d062-7e0b7036834e-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 80+ messages in thread
From: Christian König @ 2019-05-10 12:28 UTC (permalink / raw)
  To: Kenny Ho, y2kenny-Re5JQEeQqe8AvxtiuMwx3w,
	cgroups-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	tj-DgEjT+Ai2ygdnm+yROfE0A, sunnanyong-hv44wF8Li93QT0dZR+AlfA,
	alexander.deucher-5C7GfCeVMHo,
	brian.welty-ral2JQCrhuEAvxtiuMwx3w

Am 09.05.19 um 23:04 schrieb Kenny Ho:
> The drm resource being measured and limited here is the GEM buffer
> objects.  User applications allocate and free these buffers.  In
> addition, a process can allocate a buffer and share it with another
> process.  The consumer of a shared buffer can also outlive the
> allocator of the buffer.
>
> For the purpose of cgroup accounting and limiting, ownership of the
> buffer is deemed to be the cgroup to which the allocating process
> belongs.  There is one limit per drm device.
>
> In order to prevent the buffer outliving the cgroup that owns it, a
> process is prevented from importing buffers that are not owned by the
> process' cgroup or the ancestors of the process' cgroup.
>
> For this resource, the control files are prefixed with drm.buffer.total.
>
> There are four control file types,
> stats (ro) - display current measured values for a resource
> max (rw) - limits for a resource
> default (ro, root cgroup only) - default values for a resource
> help (ro, root cgroup only) - help string for a resource
>
> Each file is multi-lined with one entry/line per drm device.
>
> Usage examples:
> // set limit for card1 to 1GB
> sed -i '2s/.*/1073741824/' /sys/fs/cgroup/<cgroup>/drm.buffer.total.max
>
> // set limit for card0 to 512MB
> sed -i '1s/.*/536870912/' /sys/fs/cgroup/<cgroup>/drm.buffer.total.max
>
> Change-Id: I4c249d06d45ec709d6481d4cbe87c5168545c5d0
> Signed-off-by: Kenny Ho <Kenny.Ho@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |   4 +
>   drivers/gpu/drm/drm_gem.c                  |   7 +
>   drivers/gpu/drm/drm_prime.c                |   9 +
>   include/drm/drm_cgroup.h                   |  34 ++-
>   include/drm/drm_gem.h                      |  11 +
>   include/linux/cgroup_drm.h                 |   3 +
>   kernel/cgroup/drm.c                        | 280 +++++++++++++++++++++
>   7 files changed, 346 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index 93b2c5a48a71..b4c078b7ad63 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -34,6 +34,7 @@
>   #include <drm/drmP.h>
>   #include <drm/amdgpu_drm.h>
>   #include <drm/drm_cache.h>
> +#include <drm/drm_cgroup.h>
>   #include "amdgpu.h"
>   #include "amdgpu_trace.h"
>   #include "amdgpu_amdkfd.h"
> @@ -446,6 +447,9 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
>   	if (!amdgpu_bo_validate_size(adev, size, bp->domain))
>   		return -ENOMEM;
>   
> +	if (!drmcgrp_bo_can_allocate(current, adev->ddev, size))
> +		return -ENOMEM;
> +
>   	*bo_ptr = NULL;
>   
>   	acc_size = ttm_bo_dma_acc_size(&adev->mman.bdev, size,
> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> index 6a80db077dc6..cbd49bf34dcf 100644
> --- a/drivers/gpu/drm/drm_gem.c
> +++ b/drivers/gpu/drm/drm_gem.c
> @@ -37,10 +37,12 @@
>   #include <linux/shmem_fs.h>
>   #include <linux/dma-buf.h>
>   #include <linux/mem_encrypt.h>
> +#include <linux/cgroup_drm.h>
>   #include <drm/drmP.h>
>   #include <drm/drm_vma_manager.h>
>   #include <drm/drm_gem.h>
>   #include <drm/drm_print.h>
> +#include <drm/drm_cgroup.h>
>   #include "drm_internal.h"
>   
>   /** @file drm_gem.c
> @@ -154,6 +156,9 @@ void drm_gem_private_object_init(struct drm_device *dev,
>   	obj->handle_count = 0;
>   	obj->size = size;
>   	drm_vma_node_reset(&obj->vma_node);
> +
> +	obj->drmcgrp = get_drmcgrp(current);
> +	drmcgrp_chg_bo_alloc(obj->drmcgrp, dev, size);
>   }
>   EXPORT_SYMBOL(drm_gem_private_object_init);
>   
> @@ -804,6 +809,8 @@ drm_gem_object_release(struct drm_gem_object *obj)
>   	if (obj->filp)
>   		fput(obj->filp);
>   
> +	drmcgrp_unchg_bo_alloc(obj->drmcgrp, obj->dev, obj->size);
> +
>   	drm_gem_free_mmap_offset(obj);
>   }
>   EXPORT_SYMBOL(drm_gem_object_release);
> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
> index 231e3f6d5f41..faed5611a1c6 100644
> --- a/drivers/gpu/drm/drm_prime.c
> +++ b/drivers/gpu/drm/drm_prime.c
> @@ -32,6 +32,7 @@
>   #include <drm/drm_prime.h>
>   #include <drm/drm_gem.h>
>   #include <drm/drmP.h>
> +#include <drm/drm_cgroup.h>
>   
>   #include "drm_internal.h"
>   
> @@ -794,6 +795,7 @@ int drm_gem_prime_fd_to_handle(struct drm_device *dev,
>   {
>   	struct dma_buf *dma_buf;
>   	struct drm_gem_object *obj;
> +	struct drmcgrp *drmcgrp = get_drmcgrp(current);
>   	int ret;
>   
>   	dma_buf = dma_buf_get(prime_fd);
> @@ -818,6 +820,13 @@ int drm_gem_prime_fd_to_handle(struct drm_device *dev,
>   		goto out_unlock;
>   	}
>   
> +	/* only allow bo from the same cgroup or its ancestor to be imported */
> +	if (drmcgrp != NULL &&
> +			!drmcgrp_is_self_or_ancestor(drmcgrp, obj->drmcgrp)) {
> +		ret = -EACCES;
> +		goto out_unlock;
> +	}
> +

This will most likely go up in flames.

If I'm not completely mistaken we already use 
drm_gem_prime_fd_to_handle() to exchange handles between different 
cgroups in current container usages.

Christian.
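
For reference, the userspace side of that exchange looks roughly like
the following sketch (import_shared_buffer() and its two fd parameters
are made-up names for illustration; drmPrimeFDToHandle() is the libdrm
wrapper around that ioctl).  With patch 4 applied, this is the call
that would start failing when the importer's cgroup is unrelated to
the buffer owner's:

#include <stdint.h>
#include <stdio.h>
#include <xf86drm.h>

/* hypothetical helper for illustration only
 * drm_fd:   an open DRM device node, e.g. /dev/dri/renderD128
 * prime_fd: a dma-buf fd received from the exporting process,
 *           typically passed over a Unix domain socket
 */
int import_shared_buffer(int drm_fd, int prime_fd)
{
	uint32_t handle;
	int ret = drmPrimeFDToHandle(drm_fd, prime_fd, &handle);

	if (ret) {
		fprintf(stderr, "prime import failed (%d)\n", ret);
		return ret;
	}
	printf("imported dma-buf as GEM handle %u\n", handle);
	return 0;
}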

>   	if (obj->dma_buf) {
>   		WARN_ON(obj->dma_buf != dma_buf);
>   	} else {
> diff --git a/include/drm/drm_cgroup.h b/include/drm/drm_cgroup.h
> index ddb9eab64360..8711b7c5f7bf 100644
> --- a/include/drm/drm_cgroup.h
> +++ b/include/drm/drm_cgroup.h
> @@ -4,12 +4,20 @@
>   #ifndef __DRM_CGROUP_H__
>   #define __DRM_CGROUP_H__
>   
> +#include <linux/cgroup_drm.h>
> +
>   #ifdef CONFIG_CGROUP_DRM
>   
>   int drmcgrp_register_device(struct drm_device *device);
> -
>   int drmcgrp_unregister_device(struct drm_device *device);
> -
> +bool drmcgrp_is_self_or_ancestor(struct drmcgrp *self,
> +		struct drmcgrp *relative);
> +void drmcgrp_chg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
> +		size_t size);
> +void drmcgrp_unchg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
> +		size_t size);
> +bool drmcgrp_bo_can_allocate(struct task_struct *task, struct drm_device *dev,
> +		size_t size);
>   #else
>   static inline int drmcgrp_register_device(struct drm_device *device)
>   {
> @@ -20,5 +28,27 @@ static inline int drmcgrp_unregister_device(struct drm_device *device)
>   {
>   	return 0;
>   }
> +
> +static inline bool drmcgrp_is_self_or_ancestor(struct drmcgrp *self,
> +		struct drmcgrp *relative)
> +{
> +	return false;
> +}
> +
> +static inline void drmcgrp_chg_bo_alloc(struct drmcgrp *drmcgrp,
> +		struct drm_device *dev,	size_t size)
> +{
> +}
> +
> +static inline void drmcgrp_unchg_bo_alloc(struct drmcgrp *drmcgrp,
> +		struct drm_device *dev,	size_t size)
> +{
> +}
> +
> +static inline bool drmcgrp_bo_can_allocate(struct task_struct *task,
> +		struct drm_device *dev,	size_t size)
> +{
> +	return true;
> +}
>   #endif /* CONFIG_CGROUP_DRM */
>   #endif /* __DRM_CGROUP_H__ */
> diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
> index c95727425284..02854c674b5c 100644
> --- a/include/drm/drm_gem.h
> +++ b/include/drm/drm_gem.h
> @@ -272,6 +272,17 @@ struct drm_gem_object {
>   	 *
>   	 */
>   	const struct drm_gem_object_funcs *funcs;
> +
> +	/**
> +	 * @drmcgrp:
> +	 *
> +	 * DRM cgroup this GEM object belongs to.
> +         *
> +         * This is used to track and limit the amount of GEM objects a user
> +         * can allocate.  Since GEM objects can be shared, this is also used
> +         * to ensure GEM objects are only shared within the same cgroup.
> +	 */
> +	struct drmcgrp *drmcgrp;
>   };
>   
>   /**
> diff --git a/include/linux/cgroup_drm.h b/include/linux/cgroup_drm.h
> index d7ccf434ca6b..fe14ba7bb1cf 100644
> --- a/include/linux/cgroup_drm.h
> +++ b/include/linux/cgroup_drm.h
> @@ -15,6 +15,9 @@
>   
>   struct drmcgrp_device_resource {
>   	/* for per device stats */
> +	s64			bo_stats_total_allocated;
> +
> +	s64			bo_limits_total_allocated;
>   };
>   
>   struct drmcgrp {
> diff --git a/kernel/cgroup/drm.c b/kernel/cgroup/drm.c
> index f9ef4bf042d8..bc3abff09113 100644
> --- a/kernel/cgroup/drm.c
> +++ b/kernel/cgroup/drm.c
> @@ -15,6 +15,22 @@ static DEFINE_MUTEX(drmcgrp_mutex);
>   struct drmcgrp_device {
>   	struct drm_device	*dev;
>   	struct mutex		mutex;
> +
> +	s64			bo_limits_total_allocated_default;
> +};
> +
> +#define DRMCG_CTF_PRIV_SIZE 3
> +#define DRMCG_CTF_PRIV_MASK GENMASK((DRMCG_CTF_PRIV_SIZE - 1), 0)
> +
> +enum drmcgrp_res_type {
> +	DRMCGRP_TYPE_BO_TOTAL,
> +};
> +
> +enum drmcgrp_file_type {
> +	DRMCGRP_FTYPE_STATS,
> +	DRMCGRP_FTYPE_MAX,
> +	DRMCGRP_FTYPE_DEFAULT,
> +	DRMCGRP_FTYPE_HELP,
>   };
>   
>   /* indexed by drm_minor for access speed */
> @@ -53,6 +69,10 @@ static inline int init_drmcgrp_single(struct drmcgrp *drmcgrp, int i)
>   	}
>   
>   	/* set defaults here */
> +	if (known_drmcgrp_devs[i] != NULL) {
> +		ddr->bo_limits_total_allocated =
> +		  known_drmcgrp_devs[i]->bo_limits_total_allocated_default;
> +	}
>   
>   	return 0;
>   }
> @@ -99,7 +119,187 @@ drmcgrp_css_alloc(struct cgroup_subsys_state *parent_css)
>   	return &drmcgrp->css;
>   }
>   
> +static inline void drmcgrp_print_stats(struct drmcgrp_device_resource *ddr,
> +		struct seq_file *sf, enum drmcgrp_res_type type)
> +{
> +	if (ddr == NULL) {
> +		seq_puts(sf, "\n");
> +		return;
> +	}
> +
> +	switch (type) {
> +	case DRMCGRP_TYPE_BO_TOTAL:
> +		seq_printf(sf, "%lld\n", ddr->bo_stats_total_allocated);
> +		break;
> +	default:
> +		seq_puts(sf, "\n");
> +		break;
> +	}
> +}
> +
> +static inline void drmcgrp_print_limits(struct drmcgrp_device_resource *ddr,
> +		struct seq_file *sf, enum drmcgrp_res_type type)
> +{
> +	if (ddr == NULL) {
> +		seq_puts(sf, "\n");
> +		return;
> +	}
> +
> +	switch (type) {
> +	case DRMCGRP_TYPE_BO_TOTAL:
> +		seq_printf(sf, "%lld\n", ddr->bo_limits_total_allocated);
> +		break;
> +	default:
> +		seq_puts(sf, "\n");
> +		break;
> +	}
> +}
> +
> +static inline void drmcgrp_print_default(struct drmcgrp_device *ddev,
> +		struct seq_file *sf, enum drmcgrp_res_type type)
> +{
> +	if (ddev == NULL) {
> +		seq_puts(sf, "\n");
> +		return;
> +	}
> +
> +	switch (type) {
> +	case DRMCGRP_TYPE_BO_TOTAL:
> +		seq_printf(sf, "%lld\n", ddev->bo_limits_total_allocated_default);
> +		break;
> +	default:
> +		seq_puts(sf, "\n");
> +		break;
> +	}
> +}
> +
> +static inline void drmcgrp_print_help(int cardNum, struct seq_file *sf,
> +		enum drmcgrp_res_type type)
> +{
> +	switch (type) {
> +	case DRMCGRP_TYPE_BO_TOTAL:
> +		seq_printf(sf,
> +		"Total amount of buffer allocation in bytes for card%d\n",
> +		cardNum);
> +		break;
> +	default:
> +		seq_puts(sf, "\n");
> +		break;
> +	}
> +}
> +
> +int drmcgrp_bo_show(struct seq_file *sf, void *v)
> +{
> +	struct drmcgrp *drmcgrp = css_drmcgrp(seq_css(sf));
> +	struct drmcgrp_device_resource *ddr = NULL;
> +	enum drmcgrp_file_type f_type = seq_cft(sf)->
> +		private & DRMCG_CTF_PRIV_MASK;
> +	enum drmcgrp_res_type type = seq_cft(sf)->
> +		private >> DRMCG_CTF_PRIV_SIZE;
> +	struct drmcgrp_device *ddev;
> +	int i;
> +
> +	for (i = 0; i <= max_minor; i++) {
> +		ddr = drmcgrp->dev_resources[i];
> +		ddev = known_drmcgrp_devs[i];
> +
> +		switch (f_type) {
> +		case DRMCGRP_FTYPE_STATS:
> +			drmcgrp_print_stats(ddr, sf, type);
> +			break;
> +		case DRMCGRP_FTYPE_MAX:
> +			drmcgrp_print_limits(ddr, sf, type);
> +			break;
> +		case DRMCGRP_FTYPE_DEFAULT:
> +			drmcgrp_print_default(ddev, sf, type);
> +			break;
> +		case DRMCGRP_FTYPE_HELP:
> +			drmcgrp_print_help(i, sf, type);
> +			break;
> +		default:
> +			seq_puts(sf, "\n");
> +			break;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +ssize_t drmcgrp_bo_limit_write(struct kernfs_open_file *of, char *buf,
> +		size_t nbytes, loff_t off)
> +{
> +	struct drmcgrp *drmcgrp = css_drmcgrp(of_css(of));
> +	enum drmcgrp_res_type type = of_cft(of)->private >> DRMCG_CTF_PRIV_SIZE;
> +	char *cft_name = of_cft(of)->name;
> +	char *limits = strstrip(buf);
> +	struct drmcgrp_device_resource *ddr;
> +	char *sval;
> +	s64 val;
> +	int i = 0;
> +	int rc;
> +
> +	while (i <= max_minor && limits != NULL) {
> +		sval =  strsep(&limits, "\n");
> +		rc = kstrtoll(sval, 0, &val);
> +
> +		if (rc) {
> +			pr_err("drmcgrp: %s: minor %d, err %d. ",
> +				cft_name, i, rc);
> +			pr_cont_cgroup_name(drmcgrp->css.cgroup);
> +			pr_cont("\n");
> +		} else {
> +			ddr = drmcgrp->dev_resources[i];
> +			switch (type) {
> +			case DRMCGRP_TYPE_BO_TOTAL:
> +                                if (val < 0) continue;
> +				ddr->bo_limits_total_allocated = val;
> +				break;
> +			default:
> +				break;
> +			}
> +		}
> +
> +		i++;
> +	}
> +
> +	if (i <= max_minor) {
> +		pr_err("drmcgrp: %s: less entries than # of drm devices. ",
> +				cft_name);
> +		pr_cont_cgroup_name(drmcgrp->css.cgroup);
> +		pr_cont("\n");
> +	}
> +
> +	return nbytes;
> +}
> +
>   struct cftype files[] = {
> +	{
> +		.name = "buffer.total.stats",
> +		.seq_show = drmcgrp_bo_show,
> +		.private = (DRMCGRP_TYPE_BO_TOTAL << DRMCG_CTF_PRIV_SIZE) |
> +			DRMCGRP_FTYPE_STATS,
> +	},
> +	{
> +		.name = "buffer.total.default",
> +		.seq_show = drmcgrp_bo_show,
> +		.flags = CFTYPE_ONLY_ON_ROOT,
> +		.private = (DRMCGRP_TYPE_BO_TOTAL << DRMCG_CTF_PRIV_SIZE) |
> +			DRMCGRP_FTYPE_DEFAULT,
> +	},
> +	{
> +		.name = "buffer.total.help",
> +		.seq_show = drmcgrp_bo_show,
> +		.flags = CFTYPE_ONLY_ON_ROOT,
> +		.private = (DRMCGRP_TYPE_BO_TOTAL << DRMCG_CTF_PRIV_SIZE) |
> +			DRMCGRP_FTYPE_HELP,
> +	},
> +	{
> +		.name = "buffer.total.max",
> +		.write = drmcgrp_bo_limit_write,
> +		.seq_show = drmcgrp_bo_show,
> +		.private = (DRMCGRP_TYPE_BO_TOTAL << DRMCG_CTF_PRIV_SIZE) |
> +			DRMCGRP_FTYPE_MAX,
> +	},
>   	{ }	/* terminate */
>   };
>   
> @@ -122,6 +322,8 @@ int drmcgrp_register_device(struct drm_device *dev)
>   		return -ENOMEM;
>   
>   	ddev->dev = dev;
> +	ddev->bo_limits_total_allocated_default = S64_MAX;
> +
>   	mutex_init(&ddev->mutex);
>   
>   	mutex_lock(&drmcgrp_mutex);
> @@ -156,3 +358,81 @@ int drmcgrp_unregister_device(struct drm_device *dev)
>   	return 0;
>   }
>   EXPORT_SYMBOL(drmcgrp_unregister_device);
> +
> +bool drmcgrp_is_self_or_ancestor(struct drmcgrp *self, struct drmcgrp *relative)
> +{
> +	for (; self != NULL; self = parent_drmcgrp(self))
> +		if (self == relative)
> +			return true;
> +
> +	return false;
> +}
> +EXPORT_SYMBOL(drmcgrp_is_self_or_ancestor);
> +
> +bool drmcgrp_bo_can_allocate(struct task_struct *task, struct drm_device *dev,
> +		size_t size)
> +{
> +	struct drmcgrp *drmcgrp = get_drmcgrp(task);
> +	struct drmcgrp_device_resource *ddr;
> +	struct drmcgrp_device_resource *d;
> +	int devIdx = dev->primary->index;
> +	bool result = true;
> +	s64 delta = 0;
> +
> +	if (drmcgrp == NULL || drmcgrp == root_drmcgrp)
> +		return true;
> +
> +	ddr = drmcgrp->dev_resources[devIdx];
> +	mutex_lock(&known_drmcgrp_devs[devIdx]->mutex);
> +	for ( ; drmcgrp != root_drmcgrp; drmcgrp = parent_drmcgrp(drmcgrp)) {
> +		d = drmcgrp->dev_resources[devIdx];
> +		delta = d->bo_limits_total_allocated -
> +				d->bo_stats_total_allocated;
> +
> +		if (delta <= 0 || size > delta) {
> +			result = false;
> +			break;
> +		}
> +	}
> +	mutex_unlock(&known_drmcgrp_devs[devIdx]->mutex);
> +
> +	return result;
> +}
> +EXPORT_SYMBOL(drmcgrp_bo_can_allocate);
> +
> +void drmcgrp_chg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
> +		size_t size)
> +{
> +	struct drmcgrp_device_resource *ddr;
> +	int devIdx = dev->primary->index;
> +
> +	if (drmcgrp == NULL || known_drmcgrp_devs[devIdx] == NULL)
> +		return;
> +
> +	mutex_lock(&known_drmcgrp_devs[devIdx]->mutex);
> +	for ( ; drmcgrp != NULL; drmcgrp = parent_drmcgrp(drmcgrp)) {
> +		ddr = drmcgrp->dev_resources[devIdx];
> +
> +		ddr->bo_stats_total_allocated += (s64)size;
> +	}
> +	mutex_unlock(&known_drmcgrp_devs[devIdx]->mutex);
> +}
> +EXPORT_SYMBOL(drmcgrp_chg_bo_alloc);
> +
> +void drmcgrp_unchg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
> +		size_t size)
> +{
> +	struct drmcgrp_device_resource *ddr;
> +	int devIdx = dev->primary->index;
> +
> +	if (drmcgrp == NULL || known_drmcgrp_devs[devIdx] == NULL)
> +		return;
> +
> +	ddr = drmcgrp->dev_resources[devIdx];
> +	mutex_lock(&known_drmcgrp_devs[devIdx]->mutex);
> +	for ( ; drmcgrp != NULL; drmcgrp = parent_drmcgrp(drmcgrp))
> +		drmcgrp->dev_resources[devIdx]->bo_stats_total_allocated
> +			-= (s64)size;
> +	mutex_unlock(&known_drmcgrp_devs[devIdx]->mutex);
> +}
> +EXPORT_SYMBOL(drmcgrp_unchg_bo_alloc);


* Re: [RFC PATCH v2 5/5] drm, cgroup: Add peak GEM buffer allocation limit
       [not found]       ` <20190509210410.5471-6-Kenny.Ho-5C7GfCeVMHo@public.gmane.org>
@ 2019-05-10 12:29         ` Christian König
  0 siblings, 0 replies; 80+ messages in thread
From: Christian König @ 2019-05-10 12:29 UTC (permalink / raw)
  To: Kenny Ho, y2kenny-Re5JQEeQqe8AvxtiuMwx3w,
	cgroups-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	tj-DgEjT+Ai2ygdnm+yROfE0A, sunnanyong-hv44wF8Li93QT0dZR+AlfA,
	alexander.deucher-5C7GfCeVMHo,
	brian.welty-ral2JQCrhuEAvxtiuMwx3w

Am 09.05.19 um 23:04 schrieb Kenny Ho:
> This new drmcgrp resource limits the largest GEM buffer that can be
> allocated in a cgroup.
>
> Change-Id: I0830d56775568e1cf215b56cc892d5e7945e9f25
> Signed-off-by: Kenny Ho <Kenny.Ho@amd.com>
> ---
>   include/linux/cgroup_drm.h |  2 ++
>   kernel/cgroup/drm.c        | 59 ++++++++++++++++++++++++++++++++++++++
>   2 files changed, 61 insertions(+)
>
> diff --git a/include/linux/cgroup_drm.h b/include/linux/cgroup_drm.h
> index fe14ba7bb1cf..57c07a148975 100644
> --- a/include/linux/cgroup_drm.h
> +++ b/include/linux/cgroup_drm.h
> @@ -16,8 +16,10 @@
>   struct drmcgrp_device_resource {
>   	/* for per device stats */
>   	s64			bo_stats_total_allocated;
> +	size_t			bo_stats_peak_allocated;
>   
>   	s64			bo_limits_total_allocated;
> +	size_t			bo_limits_peak_allocated;

Why s64 for the total limit and size_t for the peak allocation?

Christian.
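
As a side note on the type question, here is a standalone sketch (not
kernel code; the variable names only mirror the patch) of what happens
when a size_t and an s64 meet in a comparison on a typical 64-bit
target:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	/* standalone illustration, not the kernel implementation */
	int64_t delta = -1;	/* e.g. a limit that is already exceeded */
	size_t size = 4096;	/* a requested allocation size */

	/* on LP64 the signed operand is converted to unsigned here, so
	 * this prints 0 even though 4096 > -1 arithmetically */
	printf("size > delta          : %d\n", size > delta);

	/* with both operands signed the comparison behaves as expected */
	printf("(int64_t)size > delta : %d\n", (int64_t)size > delta);
	return 0;
}

(In patch 4 the "delta <= 0" check happens first, so the mixed
comparison there only ever sees a positive delta.)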

>   };
>   
>   struct drmcgrp {
> diff --git a/kernel/cgroup/drm.c b/kernel/cgroup/drm.c
> index bc3abff09113..5c7e1b8059ce 100644
> --- a/kernel/cgroup/drm.c
> +++ b/kernel/cgroup/drm.c
> @@ -17,6 +17,7 @@ struct drmcgrp_device {
>   	struct mutex		mutex;
>   
>   	s64			bo_limits_total_allocated_default;
> +	size_t			bo_limits_peak_allocated_default;
>   };
>   
>   #define DRMCG_CTF_PRIV_SIZE 3
> @@ -24,6 +25,7 @@ struct drmcgrp_device {
>   
>   enum drmcgrp_res_type {
>   	DRMCGRP_TYPE_BO_TOTAL,
> +	DRMCGRP_TYPE_BO_PEAK,
>   };
>   
>   enum drmcgrp_file_type {
> @@ -72,6 +74,9 @@ static inline int init_drmcgrp_single(struct drmcgrp *drmcgrp, int i)
>   	if (known_drmcgrp_devs[i] != NULL) {
>   		ddr->bo_limits_total_allocated =
>   		  known_drmcgrp_devs[i]->bo_limits_total_allocated_default;
> +
> +		ddr->bo_limits_peak_allocated =
> +		  known_drmcgrp_devs[i]->bo_limits_peak_allocated_default;
>   	}
>   
>   	return 0;
> @@ -131,6 +136,9 @@ static inline void drmcgrp_print_stats(struct drmcgrp_device_resource *ddr,
>   	case DRMCGRP_TYPE_BO_TOTAL:
>   		seq_printf(sf, "%lld\n", ddr->bo_stats_total_allocated);
>   		break;
> +	case DRMCGRP_TYPE_BO_PEAK:
> +		seq_printf(sf, "%zu\n", ddr->bo_stats_peak_allocated);
> +		break;
>   	default:
>   		seq_puts(sf, "\n");
>   		break;
> @@ -149,6 +157,9 @@ static inline void drmcgrp_print_limits(struct drmcgrp_device_resource *ddr,
>   	case DRMCGRP_TYPE_BO_TOTAL:
>   		seq_printf(sf, "%lld\n", ddr->bo_limits_total_allocated);
>   		break;
> +	case DRMCGRP_TYPE_BO_PEAK:
> +		seq_printf(sf, "%zu\n", ddr->bo_limits_peak_allocated);
> +		break;
>   	default:
>   		seq_puts(sf, "\n");
>   		break;
> @@ -167,6 +178,9 @@ static inline void drmcgrp_print_default(struct drmcgrp_device *ddev,
>   	case DRMCGRP_TYPE_BO_TOTAL:
>   		seq_printf(sf, "%lld\n", ddev->bo_limits_total_allocated_default);
>   		break;
> +	case DRMCGRP_TYPE_BO_PEAK:
> +		seq_printf(sf, "%zu\n", ddev->bo_limits_peak_allocated_default);
> +		break;
>   	default:
>   		seq_puts(sf, "\n");
>   		break;
> @@ -182,6 +196,11 @@ static inline void drmcgrp_print_help(int cardNum, struct seq_file *sf,
>   		"Total amount of buffer allocation in bytes for card%d\n",
>   		cardNum);
>   		break;
> +	case DRMCGRP_TYPE_BO_PEAK:
> +		seq_printf(sf,
> +		"Largest buffer allocation in bytes for card%d\n",
> +		cardNum);
> +		break;
>   	default:
>   		seq_puts(sf, "\n");
>   		break;
> @@ -254,6 +273,10 @@ ssize_t drmcgrp_bo_limit_write(struct kernfs_open_file *of, char *buf,
>                                   if (val < 0) continue;
>   				ddr->bo_limits_total_allocated = val;
>   				break;
> +			case DRMCGRP_TYPE_BO_PEAK:
> +                                if (val < 0) continue;
> +				ddr->bo_limits_peak_allocated = val;
> +				break;
>   			default:
>   				break;
>   			}
> @@ -300,6 +323,33 @@ struct cftype files[] = {
>   		.private = (DRMCGRP_TYPE_BO_TOTAL << DRMCG_CTF_PRIV_SIZE) |
>   			DRMCGRP_FTYPE_MAX,
>   	},
> +	{
> +		.name = "buffer.peak.stats",
> +		.seq_show = drmcgrp_bo_show,
> +		.private = (DRMCGRP_TYPE_BO_PEAK << DRMCG_CTF_PRIV_SIZE) |
> +			DRMCGRP_FTYPE_STATS,
> +	},
> +	{
> +		.name = "buffer.peak.default",
> +		.seq_show = drmcgrp_bo_show,
> +		.flags = CFTYPE_ONLY_ON_ROOT,
> +		.private = (DRMCGRP_TYPE_BO_PEAK << DRMCG_CTF_PRIV_SIZE) |
> +			DRMCGRP_FTYPE_DEFAULT,
> +	},
> +	{
> +		.name = "buffer.peak.help",
> +		.seq_show = drmcgrp_bo_show,
> +		.flags = CFTYPE_ONLY_ON_ROOT,
> +		.private = (DRMCGRP_TYPE_BO_PEAK << DRMCG_CTF_PRIV_SIZE) |
> +			DRMCGRP_FTYPE_HELP,
> +	},
> +	{
> +		.name = "buffer.peak.max",
> +		.write = drmcgrp_bo_limit_write,
> +		.seq_show = drmcgrp_bo_show,
> +		.private = (DRMCGRP_TYPE_BO_PEAK << DRMCG_CTF_PRIV_SIZE) |
> +			DRMCGRP_FTYPE_MAX,
> +	},
>   	{ }	/* terminate */
>   };
>   
> @@ -323,6 +373,7 @@ int drmcgrp_register_device(struct drm_device *dev)
>   
>   	ddev->dev = dev;
>   	ddev->bo_limits_total_allocated_default = S64_MAX;
> +	ddev->bo_limits_peak_allocated_default = SIZE_MAX;
>   
>   	mutex_init(&ddev->mutex);
>   
> @@ -393,6 +444,11 @@ bool drmcgrp_bo_can_allocate(struct task_struct *task, struct drm_device *dev,
>   			result = false;
>   			break;
>   		}
> +
> +		if (d->bo_limits_peak_allocated < size) {
> +			result = false;
> +			break;
> +		}
>   	}
>   	mutex_unlock(&known_drmcgrp_devs[devIdx]->mutex);
>   
> @@ -414,6 +470,9 @@ void drmcgrp_chg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
>   		ddr = drmcgrp->dev_resources[devIdx];
>   
>   		ddr->bo_stats_total_allocated += (s64)size;
> +
> +		if (ddr->bo_stats_peak_allocated < (size_t)size)
> +			ddr->bo_stats_peak_allocated = (size_t)size;
>   	}
>   	mutex_unlock(&known_drmcgrp_devs[devIdx]->mutex);
>   }


* Re: [RFC PATCH v2 0/5] new cgroup controller for gpu/drm subsystem
       [not found]   ` <20190509210410.5471-1-Kenny.Ho-5C7GfCeVMHo@public.gmane.org>
                       ` (2 preceding siblings ...)
  2019-05-09 21:04     ` [RFC PATCH v2 5/5] drm, cgroup: Add peak GEM buffer allocation limit Kenny Ho
@ 2019-05-10 12:31     ` Christian König
  2019-05-10 15:07         ` Kenny Ho
  3 siblings, 1 reply; 80+ messages in thread
From: Christian König @ 2019-05-10 12:31 UTC (permalink / raw)
  To: Kenny Ho, y2kenny-Re5JQEeQqe8AvxtiuMwx3w,
	cgroups-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	tj-DgEjT+Ai2ygdnm+yROfE0A, sunnanyong-hv44wF8Li93QT0dZR+AlfA,
	alexander.deucher-5C7GfCeVMHo,
	brian.welty-ral2JQCrhuEAvxtiuMwx3w

That looks better than I thought it would be.

I think it is a good approach to try to add a global limit first and,
when that's working, go ahead with limiting device-specific resources.

The only major issue I can see is on patch #4; see there for further
details.

Christian.

Am 09.05.19 um 23:04 schrieb Kenny Ho:
> This is a follow-up to the RFC I made last November to introduce a cgroup controller for the GPU/DRM subsystem [a].  The goal is to be able to provide resource management for GPU resources using things like containers.  The cover letter from v1 is copied below for reference.
>
> Usage examples:
> // set limit for card1 to 1GB
> sed -i '2s/.*/1073741824/' /sys/fs/cgroup/<cgroup>/drm.buffer.total.max
>
> // set limit for card0 to 512MB
> sed -i '1s/.*/536870912/' /sys/fs/cgroup/<cgroup>/drm.buffer.total.max
>
>
> v2:
> * Removed the vendoring concepts
> * Add limit to total buffer allocation
> * Add limit to the maximum size of a buffer allocation
>
> TODO: process migration
> TODO: documentations
>
> [a]: https://lists.freedesktop.org/archives/dri-devel/2018-November/197106.html
>
> v1: cover letter
>
> The purpose of this patch series is to start a discussion for a generic cgroup
> controller for the drm subsystem.  The design proposed here is a very early one.
> We are hoping to engage the community as we develop the idea.
>
>
> Backgrounds
> ==========
> Control Groups/cgroup provide a mechanism for aggregating/partitioning sets of
> tasks, and all their future children, into hierarchical groups with specialized
> behaviour, such as accounting/limiting the resources which processes in a cgroup
> can access[1].  Weights, limits, protections, allocations are the main resource
> distribution models.  Existing cgroup controllers includes cpu, memory, io,
> rdma, and more.  cgroup is one of the foundational technologies that enables the
> popular container application deployment and management method.
>
> Direct Rendering Manager/drm contains code intended to support the needs of
> complex graphics devices. Graphics drivers in the kernel may make use of DRM
> functions to make tasks like memory management, interrupt handling and DMA
> easier, and provide a uniform interface to applications.  The DRM has also
> developed beyond traditional graphics applications to support compute/GPGPU
> applications.
>
>
> Motivations
> =========
> As GPU grow beyond the realm of desktop/workstation graphics into areas like
> data center clusters and IoT, there are increasing needs to monitor and regulate
> GPU as a resource like cpu, memory and io.
>
> Matt Roper from Intel began working on similar idea in early 2018 [2] for the
> purpose of managing GPU priority using the cgroup hierarchy.  While that
> particular use case may not warrant a standalone drm cgroup controller, there
> are other use cases where having one can be useful [3].  Monitoring GPU
> resources such as VRAM and buffers, CU (compute unit [AMD's nomenclature])/EU
> (execution unit [Intel's nomenclature]), GPU job scheduling [4] can help
> sysadmins get a better understanding of the applications usage profile.  Further
> usage regulations of the aforementioned resources can also help sysadmins
> optimize workload deployment on limited GPU resources.
>
> With the increased importance of machine learning, data science and other
> cloud-based applications, GPUs are already in production use in data centers
> today [5,6,7].  Existing GPU resource management is very course grain, however,
> as sysadmins are only able to distribute workload on a per-GPU basis [8].  An
> alternative is to use GPU virtualization (with or without SRIOV) but it
> generally acts on the entire GPU instead of the specific resources in a GPU.
> With a drm cgroup controller, we can enable alternate, fine-grain, sub-GPU
> resource management (in addition to what may be available via GPU
> virtualization.)
>
> In addition to production use, the DRM cgroup can also help with testing
> graphics application robustness by providing a mean to artificially limit DRM
> resources availble to the applications.
>
> Challenges
> ========
> While there are common infrastructure in DRM that is shared across many vendors
> (the scheduler [4] for example), there are also aspects of DRM that are vendor
> specific.  To accommodate this, we borrowed the mechanism used by the cgroup to
> handle different kinds of cgroup controller.
>
> Resources for DRM are also often device (GPU) specific instead of system
> specific and a system may contain more than one GPU.  For this, we borrowed some
> of the ideas from RDMA cgroup controller.
>
> Approach
> =======
> To experiment with the idea of a DRM cgroup, we would like to start with basic
> accounting and statistics, then continue to iterate and add regulating
> mechanisms into the driver.
>
> [1] https://www.kernel.org/doc/Documentation/cgroup-v1/cgroups.txt
> [2] https://lists.freedesktop.org/archives/intel-gfx/2018-January/153156.html
> [3] https://www.spinics.net/lists/cgroups/msg20720.html
> [4] https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/scheduler
> [5] https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/
> [6] https://blog.openshift.com/gpu-accelerated-sql-queries-with-postgresql-pg-strom-in-openshift-3-10/
> [7] https://github.com/RadeonOpenCompute/k8s-device-plugin
> [8] https://github.com/kubernetes/kubernetes/issues/52757
>
> Kenny Ho (5):
>    cgroup: Introduce cgroup for drm subsystem
>    cgroup: Add mechanism to register DRM devices
>    drm/amdgpu: Register AMD devices for DRM cgroup
>    drm, cgroup: Add total GEM buffer allocation limit
>    drm, cgroup: Add peak GEM buffer allocation limit
>
>   drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c    |   4 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |   4 +
>   drivers/gpu/drm/drm_gem.c                  |   7 +
>   drivers/gpu/drm/drm_prime.c                |   9 +
>   include/drm/drm_cgroup.h                   |  54 +++
>   include/drm/drm_gem.h                      |  11 +
>   include/linux/cgroup_drm.h                 |  47 ++
>   include/linux/cgroup_subsys.h              |   4 +
>   init/Kconfig                               |   5 +
>   kernel/cgroup/Makefile                     |   1 +
>   kernel/cgroup/drm.c                        | 497 +++++++++++++++++++++
>   11 files changed, 643 insertions(+)
>   create mode 100644 include/drm/drm_cgroup.h
>   create mode 100644 include/linux/cgroup_drm.h
>   create mode 100644 kernel/cgroup/drm.c
>


* Re: [RFC PATCH v2 4/5] drm, cgroup: Add total GEM buffer allocation limit
       [not found]         ` <f63c8d6b-92a4-2977-d062-7e0b7036834e-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2019-05-10 14:57             ` Kenny Ho
  0 siblings, 0 replies; 80+ messages in thread
From: Kenny Ho @ 2019-05-10 14:57 UTC (permalink / raw)
  To: Christian König
  Cc: sunnanyong-hv44wF8Li93QT0dZR+AlfA, Kenny Ho, Brian Welty,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Alex Deucher,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Tejun Heo,
	cgroups-u79uwXL29TY76Z2rM5mHXA

On Fri, May 10, 2019 at 8:28 AM Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> Am 09.05.19 um 23:04 schrieb Kenny Ho:
> > +     /* only allow bo from the same cgroup or its ancestor to be imported */
> > +     if (drmcgrp != NULL &&
> > +                     !drmcgrp_is_self_or_ancestor(drmcgrp, obj->drmcgrp)) {
> > +             ret = -EACCES;
> > +             goto out_unlock;
> > +     }
> > +
>
> This will most likely go up in flames.
>
> If I'm not completely mistaken we already use
> drm_gem_prime_fd_to_handle() to exchange handles between different
> cgroups in current container usages.
This is something I am interested in getting more details on from
the broader community, because those details affect how likely this
is to go up in flames ;).  Note that this check does not block
sharing of handles from a cgroup parent to its children in the
hierarchy, nor does it block sharing of handles within a cgroup.
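
To make that concrete, here is a minimal userspace sketch of the walk
performed by drmcgrp_is_self_or_ancestor() in patch 4 (struct cg and
the cgroup names below are simplified stand-ins for illustration, not
the kernel implementation):

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* simplified stand-in for the kernel's drmcgrp, for illustration only */
struct cg {
	const char *name;
	struct cg *parent;
};

/* true if "relative" is "self" or one of self's ancestors */
static bool is_self_or_ancestor(struct cg *self, struct cg *relative)
{
	for (; self != NULL; self = self->parent)
		if (self == relative)
			return true;
	return false;
}

int main(void)
{
	struct cg root    = { "root",    NULL };
	struct cg display = { "display", &root };
	struct cg app     = { "app",     &display };
	struct cg peer    = { "peer",    &root };

	/* importer app, buffer owned by its parent display: allowed */
	printf("app importing from display: %s\n",
	       is_self_or_ancestor(&app, &display) ? "allowed" : "denied");
	/* importer app, buffer owned by the sibling subtree peer: denied */
	printf("app importing from peer:    %s\n",
	       is_self_or_ancestor(&app, &peer) ? "allowed" : "denied");
	return 0;
}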

I am interested to find out whether, when existing apps share handles
between containers, there are any expectations on resource management.
Since there is no drm cgroup for current container usage, I expect
the answer to be no.  In this case, the drm cgroup controller can be
disabled on its own (in the context of cgroup-v2's unified hierarchy),
or the processes can remain at the root of the drm cgroup hierarchy
(in the context of cgroup-v1).  If I understand the cgroup api
correctly, that means all processes would be part of the root cgroup
as far as the drm controller is concerned and this check will not come
into effect.  I have verified that this is indeed the current default
behaviour of a container runtime (runc, which is used by docker,
podman and others).  The new drm cgroup controller is simply ignored
and all processes remain at the root of the hierarchy (since there are
no other cgroups).  I plan to make contributions to runc (so folks can
actually use this feature with docker/podman/k8s, etc.) once things
stabilize on the kernel side.
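
A rough way to see this on a running system is to dump the current
task's cgroup membership; the little program below is just a sketch
(not part of this series).  A cgroup path of "/" in the relevant
hierarchy means the task is still at the root, which is the situation
described above:

#include <stdio.h>

int main(void)
{
	/* sketch, not part of the series; each line of the file is
	 * "hierarchy-ID:controller-list:cgroup-path", and a path of
	 * "/" means the task sits at the root of that hierarchy */
	FILE *f = fopen("/proc/self/cgroup", "r");
	char line[512];

	if (!f) {
		perror("/proc/self/cgroup");
		return 1;
	}
	while (fgets(line, sizeof(line), f))
		fputs(line, stdout);
	fclose(f);
	return 0;
}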

On the other hand, if there are expectations for resource management
between containers, I would like to know who the expected manager is
and how it fits into the concept of a container (which enforces some
level of isolation).  One possible manager may be the display server.
But as long as the display server is in a parent cgroup of the apps'
cgroup, the apps can still import handles from the display server
under the current implementation.  My understanding is that this is
most likely the case, with the display server simply sitting at the
default/root cgroup.  But I certainly want to hear more about other
use cases (for example, is running multiple display servers on a
single host a realistic possibility?  Are there people running
multiple display servers inside peer containers?  If so, how do they
coordinate resources?)

I should probably summarize some of these into the commit message.

Regards,
Kenny



> Christian.
>
> >       if (obj->dma_buf) {
> >               WARN_ON(obj->dma_buf != dma_buf);
> >       } else {
> > diff --git a/include/drm/drm_cgroup.h b/include/drm/drm_cgroup.h
> > index ddb9eab64360..8711b7c5f7bf 100644
> > --- a/include/drm/drm_cgroup.h
> > +++ b/include/drm/drm_cgroup.h
> > @@ -4,12 +4,20 @@
> >   #ifndef __DRM_CGROUP_H__
> >   #define __DRM_CGROUP_H__
> >
> > +#include <linux/cgroup_drm.h>
> > +
> >   #ifdef CONFIG_CGROUP_DRM
> >
> >   int drmcgrp_register_device(struct drm_device *device);
> > -
> >   int drmcgrp_unregister_device(struct drm_device *device);
> > -
> > +bool drmcgrp_is_self_or_ancestor(struct drmcgrp *self,
> > +             struct drmcgrp *relative);
> > +void drmcgrp_chg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
> > +             size_t size);
> > +void drmcgrp_unchg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
> > +             size_t size);
> > +bool drmcgrp_bo_can_allocate(struct task_struct *task, struct drm_device *dev,
> > +             size_t size);
> >   #else
> >   static inline int drmcgrp_register_device(struct drm_device *device)
> >   {
> > @@ -20,5 +28,27 @@ static inline int drmcgrp_unregister_device(struct drm_device *device)
> >   {
> >       return 0;
> >   }
> > +
> > +static inline bool drmcgrp_is_self_or_ancestor(struct drmcgrp *self,
> > +             struct drmcgrp *relative)
> > +{
> > +     return false;
> > +}
> > +
> > +static inline void drmcgrp_chg_bo_alloc(struct drmcgrp *drmcgrp,
> > +             struct drm_device *dev, size_t size)
> > +{
> > +}
> > +
> > +static inline void drmcgrp_unchg_bo_alloc(struct drmcgrp *drmcgrp,
> > +             struct drm_device *dev, size_t size)
> > +{
> > +}
> > +
> > +static inline bool drmcgrp_bo_can_allocate(struct task_struct *task,
> > +             struct drm_device *dev, size_t size)
> > +{
> > +     return true;
> > +}
> >   #endif /* CONFIG_CGROUP_DRM */
> >   #endif /* __DRM_CGROUP_H__ */
> > diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
> > index c95727425284..02854c674b5c 100644
> > --- a/include/drm/drm_gem.h
> > +++ b/include/drm/drm_gem.h
> > @@ -272,6 +272,17 @@ struct drm_gem_object {
> >        *
> >        */
> >       const struct drm_gem_object_funcs *funcs;
> > +
> > +     /**
> > +      * @drmcgrp:
> > +      *
> > +      * DRM cgroup this GEM object belongs to.
> > +         *
> > +         * This is used to track and limit the amount of GEM objects a user
> > +         * can allocate.  Since GEM objects can be shared, this is also used
> > +         * to ensure GEM objects are only shared within the same cgroup.
> > +      */
> > +     struct drmcgrp *drmcgrp;
> >   };
> >
> >   /**
> > diff --git a/include/linux/cgroup_drm.h b/include/linux/cgroup_drm.h
> > index d7ccf434ca6b..fe14ba7bb1cf 100644
> > --- a/include/linux/cgroup_drm.h
> > +++ b/include/linux/cgroup_drm.h
> > @@ -15,6 +15,9 @@
> >
> >   struct drmcgrp_device_resource {
> >       /* for per device stats */
> > +     s64                     bo_stats_total_allocated;
> > +
> > +     s64                     bo_limits_total_allocated;
> >   };
> >
> >   struct drmcgrp {
> > diff --git a/kernel/cgroup/drm.c b/kernel/cgroup/drm.c
> > index f9ef4bf042d8..bc3abff09113 100644
> > --- a/kernel/cgroup/drm.c
> > +++ b/kernel/cgroup/drm.c
> > @@ -15,6 +15,22 @@ static DEFINE_MUTEX(drmcgrp_mutex);
> >   struct drmcgrp_device {
> >       struct drm_device       *dev;
> >       struct mutex            mutex;
> > +
> > +     s64                     bo_limits_total_allocated_default;
> > +};
> > +
> > +#define DRMCG_CTF_PRIV_SIZE 3
> > +#define DRMCG_CTF_PRIV_MASK GENMASK((DRMCG_CTF_PRIV_SIZE - 1), 0)
> > +
> > +enum drmcgrp_res_type {
> > +     DRMCGRP_TYPE_BO_TOTAL,
> > +};
> > +
> > +enum drmcgrp_file_type {
> > +     DRMCGRP_FTYPE_STATS,
> > +     DRMCGRP_FTYPE_MAX,
> > +     DRMCGRP_FTYPE_DEFAULT,
> > +     DRMCGRP_FTYPE_HELP,
> >   };
> >
> >   /* indexed by drm_minor for access speed */
> > @@ -53,6 +69,10 @@ static inline int init_drmcgrp_single(struct drmcgrp *drmcgrp, int i)
> >       }
> >
> >       /* set defaults here */
> > +     if (known_drmcgrp_devs[i] != NULL) {
> > +             ddr->bo_limits_total_allocated =
> > +               known_drmcgrp_devs[i]->bo_limits_total_allocated_default;
> > +     }
> >
> >       return 0;
> >   }
> > @@ -99,7 +119,187 @@ drmcgrp_css_alloc(struct cgroup_subsys_state *parent_css)
> >       return &drmcgrp->css;
> >   }
> >
> > +static inline void drmcgrp_print_stats(struct drmcgrp_device_resource *ddr,
> > +             struct seq_file *sf, enum drmcgrp_res_type type)
> > +{
> > +     if (ddr == NULL) {
> > +             seq_puts(sf, "\n");
> > +             return;
> > +     }
> > +
> > +     switch (type) {
> > +     case DRMCGRP_TYPE_BO_TOTAL:
> > +             seq_printf(sf, "%lld\n", ddr->bo_stats_total_allocated);
> > +             break;
> > +     default:
> > +             seq_puts(sf, "\n");
> > +             break;
> > +     }
> > +}
> > +
> > +static inline void drmcgrp_print_limits(struct drmcgrp_device_resource *ddr,
> > +             struct seq_file *sf, enum drmcgrp_res_type type)
> > +{
> > +     if (ddr == NULL) {
> > +             seq_puts(sf, "\n");
> > +             return;
> > +     }
> > +
> > +     switch (type) {
> > +     case DRMCGRP_TYPE_BO_TOTAL:
> > +             seq_printf(sf, "%lld\n", ddr->bo_limits_total_allocated);
> > +             break;
> > +     default:
> > +             seq_puts(sf, "\n");
> > +             break;
> > +     }
> > +}
> > +
> > +static inline void drmcgrp_print_default(struct drmcgrp_device *ddev,
> > +             struct seq_file *sf, enum drmcgrp_res_type type)
> > +{
> > +     if (ddev == NULL) {
> > +             seq_puts(sf, "\n");
> > +             return;
> > +     }
> > +
> > +     switch (type) {
> > +     case DRMCGRP_TYPE_BO_TOTAL:
> > +             seq_printf(sf, "%lld\n", ddev->bo_limits_total_allocated_default);
> > +             break;
> > +     default:
> > +             seq_puts(sf, "\n");
> > +             break;
> > +     }
> > +}
> > +
> > +static inline void drmcgrp_print_help(int cardNum, struct seq_file *sf,
> > +             enum drmcgrp_res_type type)
> > +{
> > +     switch (type) {
> > +     case DRMCGRP_TYPE_BO_TOTAL:
> > +             seq_printf(sf,
> > +             "Total amount of buffer allocation in bytes for card%d\n",
> > +             cardNum);
> > +             break;
> > +     default:
> > +             seq_puts(sf, "\n");
> > +             break;
> > +     }
> > +}
> > +
> > +int drmcgrp_bo_show(struct seq_file *sf, void *v)
> > +{
> > +     struct drmcgrp *drmcgrp = css_drmcgrp(seq_css(sf));
> > +     struct drmcgrp_device_resource *ddr = NULL;
> > +     enum drmcgrp_file_type f_type = seq_cft(sf)->
> > +             private & DRMCG_CTF_PRIV_MASK;
> > +     enum drmcgrp_res_type type = seq_cft(sf)->
> > +             private >> DRMCG_CTF_PRIV_SIZE;
> > +     struct drmcgrp_device *ddev;
> > +     int i;
> > +
> > +     for (i = 0; i <= max_minor; i++) {
> > +             ddr = drmcgrp->dev_resources[i];
> > +             ddev = known_drmcgrp_devs[i];
> > +
> > +             switch (f_type) {
> > +             case DRMCGRP_FTYPE_STATS:
> > +                     drmcgrp_print_stats(ddr, sf, type);
> > +                     break;
> > +             case DRMCGRP_FTYPE_MAX:
> > +                     drmcgrp_print_limits(ddr, sf, type);
> > +                     break;
> > +             case DRMCGRP_FTYPE_DEFAULT:
> > +                     drmcgrp_print_default(ddev, sf, type);
> > +                     break;
> > +             case DRMCGRP_FTYPE_HELP:
> > +                     drmcgrp_print_help(i, sf, type);
> > +                     break;
> > +             default:
> > +                     seq_puts(sf, "\n");
> > +                     break;
> > +             }
> > +     }
> > +
> > +     return 0;
> > +}
> > +
> > +ssize_t drmcgrp_bo_limit_write(struct kernfs_open_file *of, char *buf,
> > +             size_t nbytes, loff_t off)
> > +{
> > +     struct drmcgrp *drmcgrp = css_drmcgrp(of_css(of));
> > +     enum drmcgrp_res_type type = of_cft(of)->private >> DRMCG_CTF_PRIV_SIZE;
> > +     char *cft_name = of_cft(of)->name;
> > +     char *limits = strstrip(buf);
> > +     struct drmcgrp_device_resource *ddr;
> > +     char *sval;
> > +     s64 val;
> > +     int i = 0;
> > +     int rc;
> > +
> > +     while (i <= max_minor && limits != NULL) {
> > +             sval =  strsep(&limits, "\n");
> > +             rc = kstrtoll(sval, 0, &val);
> > +
> > +             if (rc) {
> > +                     pr_err("drmcgrp: %s: minor %d, err %d. ",
> > +                             cft_name, i, rc);
> > +                     pr_cont_cgroup_name(drmcgrp->css.cgroup);
> > +                     pr_cont("\n");
> > +             } else {
> > +                     ddr = drmcgrp->dev_resources[i];
> > +                     switch (type) {
> > +                     case DRMCGRP_TYPE_BO_TOTAL:
> > +                                if (val < 0) continue;
> > +                             ddr->bo_limits_total_allocated = val;
> > +                             break;
> > +                     default:
> > +                             break;
> > +                     }
> > +             }
> > +
> > +             i++;
> > +     }
> > +
> > +     if (i <= max_minor) {
> > +             pr_err("drmcgrp: %s: less entries than # of drm devices. ",
> > +                             cft_name);
> > +             pr_cont_cgroup_name(drmcgrp->css.cgroup);
> > +             pr_cont("\n");
> > +     }
> > +
> > +     return nbytes;
> > +}
> > +
> >   struct cftype files[] = {
> > +     {
> > +             .name = "buffer.total.stats",
> > +             .seq_show = drmcgrp_bo_show,
> > +             .private = (DRMCGRP_TYPE_BO_TOTAL << DRMCG_CTF_PRIV_SIZE) |
> > +                     DRMCGRP_FTYPE_STATS,
> > +     },
> > +     {
> > +             .name = "buffer.total.default",
> > +             .seq_show = drmcgrp_bo_show,
> > +             .flags = CFTYPE_ONLY_ON_ROOT,
> > +             .private = (DRMCGRP_TYPE_BO_TOTAL << DRMCG_CTF_PRIV_SIZE) |
> > +                     DRMCGRP_FTYPE_DEFAULT,
> > +     },
> > +     {
> > +             .name = "buffer.total.help",
> > +             .seq_show = drmcgrp_bo_show,
> > +             .flags = CFTYPE_ONLY_ON_ROOT,
> > +             .private = (DRMCGRP_TYPE_BO_TOTAL << DRMCG_CTF_PRIV_SIZE) |
> > +                     DRMCGRP_FTYPE_HELP,
> > +     },
> > +     {
> > +             .name = "buffer.total.max",
> > +             .write = drmcgrp_bo_limit_write,
> > +             .seq_show = drmcgrp_bo_show,
> > +             .private = (DRMCGRP_TYPE_BO_TOTAL << DRMCG_CTF_PRIV_SIZE) |
> > +                     DRMCGRP_FTYPE_MAX,
> > +     },
> >       { }     /* terminate */
> >   };
> >
> > @@ -122,6 +322,8 @@ int drmcgrp_register_device(struct drm_device *dev)
> >               return -ENOMEM;
> >
> >       ddev->dev = dev;
> > +     ddev->bo_limits_total_allocated_default = S64_MAX;
> > +
> >       mutex_init(&ddev->mutex);
> >
> >       mutex_lock(&drmcgrp_mutex);
> > @@ -156,3 +358,81 @@ int drmcgrp_unregister_device(struct drm_device *dev)
> >       return 0;
> >   }
> >   EXPORT_SYMBOL(drmcgrp_unregister_device);
> > +
> > +bool drmcgrp_is_self_or_ancestor(struct drmcgrp *self, struct drmcgrp *relative)
> > +{
> > +     for (; self != NULL; self = parent_drmcgrp(self))
> > +             if (self == relative)
> > +                     return true;
> > +
> > +     return false;
> > +}
> > +EXPORT_SYMBOL(drmcgrp_is_self_or_ancestor);
> > +
> > +bool drmcgrp_bo_can_allocate(struct task_struct *task, struct drm_device *dev,
> > +             size_t size)
> > +{
> > +     struct drmcgrp *drmcgrp = get_drmcgrp(task);
> > +     struct drmcgrp_device_resource *ddr;
> > +     struct drmcgrp_device_resource *d;
> > +     int devIdx = dev->primary->index;
> > +     bool result = true;
> > +     s64 delta = 0;
> > +
> > +     if (drmcgrp == NULL || drmcgrp == root_drmcgrp)
> > +             return true;
> > +
> > +     ddr = drmcgrp->dev_resources[devIdx];
> > +     mutex_lock(&known_drmcgrp_devs[devIdx]->mutex);
> > +     for ( ; drmcgrp != root_drmcgrp; drmcgrp = parent_drmcgrp(drmcgrp)) {
> > +             d = drmcgrp->dev_resources[devIdx];
> > +             delta = d->bo_limits_total_allocated -
> > +                             d->bo_stats_total_allocated;
> > +
> > +             if (delta <= 0 || size > delta) {
> > +                     result = false;
> > +                     break;
> > +             }
> > +     }
> > +     mutex_unlock(&known_drmcgrp_devs[devIdx]->mutex);
> > +
> > +     return result;
> > +}
> > +EXPORT_SYMBOL(drmcgrp_bo_can_allocate);
> > +
> > +void drmcgrp_chg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
> > +             size_t size)
> > +{
> > +     struct drmcgrp_device_resource *ddr;
> > +     int devIdx = dev->primary->index;
> > +
> > +     if (drmcgrp == NULL || known_drmcgrp_devs[devIdx] == NULL)
> > +             return;
> > +
> > +     mutex_lock(&known_drmcgrp_devs[devIdx]->mutex);
> > +     for ( ; drmcgrp != NULL; drmcgrp = parent_drmcgrp(drmcgrp)) {
> > +             ddr = drmcgrp->dev_resources[devIdx];
> > +
> > +             ddr->bo_stats_total_allocated += (s64)size;
> > +     }
> > +     mutex_unlock(&known_drmcgrp_devs[devIdx]->mutex);
> > +}
> > +EXPORT_SYMBOL(drmcgrp_chg_bo_alloc);
> > +
> > +void drmcgrp_unchg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
> > +             size_t size)
> > +{
> > +     struct drmcgrp_device_resource *ddr;
> > +     int devIdx = dev->primary->index;
> > +
> > +     if (drmcgrp == NULL || known_drmcgrp_devs[devIdx] == NULL)
> > +             return;
> > +
> > +     ddr = drmcgrp->dev_resources[devIdx];
> > +     mutex_lock(&known_drmcgrp_devs[devIdx]->mutex);
> > +     for ( ; drmcgrp != NULL; drmcgrp = parent_drmcgrp(drmcgrp))
> > +             drmcgrp->dev_resources[devIdx]->bo_stats_total_allocated
> > +                     -= (s64)size;
> > +     mutex_unlock(&known_drmcgrp_devs[devIdx]->mutex);
> > +}
> > +EXPORT_SYMBOL(drmcgrp_unchg_bo_alloc);
>

* Re: [RFC PATCH v2 4/5] drm, cgroup: Add total GEM buffer allocation limit
@ 2019-05-10 14:57             ` Kenny Ho
  0 siblings, 0 replies; 80+ messages in thread
From: Kenny Ho @ 2019-05-10 14:57 UTC (permalink / raw)
  To: Christian König
  Cc: sunnanyong-hv44wF8Li93QT0dZR+AlfA, Kenny Ho, Brian Welty,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Alex Deucher,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Tejun Heo,
	cgroups-u79uwXL29TY76Z2rM5mHXA

On Fri, May 10, 2019 at 8:28 AM Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> Am 09.05.19 um 23:04 schrieb Kenny Ho:
> > +     /* only allow bo from the same cgroup or its ancestor to be imported */
> > +     if (drmcgrp != NULL &&
> > +                     !drmcgrp_is_self_or_ancestor(drmcgrp, obj->drmcgrp)) {
> > +             ret = -EACCES;
> > +             goto out_unlock;
> > +     }
> > +
>
> This will most likely go up in flames.
>
> If I'm not completely mistaken we already use
> drm_gem_prime_fd_to_handle() to exchange handles between different
> cgroups in current container usages.
This is something I am interested in getting more details on from
the broader community, because those details affect how likely this
is to go up in flames ;).  Note that this check does not block
sharing of handles from a cgroup parent to its children in the
hierarchy, nor does it block sharing of handles within a cgroup.

I am interested to find out whether, when existing apps share handles
between containers, there are any expectations on resource management.
Since there is no drm cgroup for current container usage, I expect
the answer to be no.  In this case, the drm cgroup controller can be
disabled on its own (in the context of cgroup-v2's unified hierarchy),
or the processes can remain at the root of the drm cgroup hierarchy
(in the context of cgroup-v1).  If I understand the cgroup api
correctly, that means all processes would be part of the root cgroup
as far as the drm controller is concerned and this check will not come
into effect.  I have verified that this is indeed the current default
behaviour of a container runtime (runc, which is used by docker,
podman and others).  The new drm cgroup controller is simply ignored
and all processes remain at the root of the hierarchy (since there are
no other cgroups).  I plan to make contributions to runc (so folks can
actually use this feature with docker/podman/k8s, etc.) once things
stabilize on the kernel side.

On the other hand, if there are expectations for resource management
between containers, I would like to know who the expected manager is
and how it fits into the concept of a container (which enforces some
level of isolation).  One possible manager may be the display server.
But as long as the display server is in a parent cgroup of the apps'
cgroup, the apps can still import handles from the display server
under the current implementation.  My understanding is that this is
most likely the case, with the display server simply sitting at the
default/root cgroup.  But I certainly want to hear more about other
use cases (for example, is running multiple display servers on a
single host a realistic possibility?  Are there people running
multiple display servers inside peer containers?  If so, how do they
coordinate resources?)

I should probably summarize some of these into the commit message.

Regards,
Kenny



> Christian.
>
> >       if (obj->dma_buf) {
> >               WARN_ON(obj->dma_buf != dma_buf);
> >       } else {
> > diff --git a/include/drm/drm_cgroup.h b/include/drm/drm_cgroup.h
> > index ddb9eab64360..8711b7c5f7bf 100644
> > --- a/include/drm/drm_cgroup.h
> > +++ b/include/drm/drm_cgroup.h
> > @@ -4,12 +4,20 @@
> >   #ifndef __DRM_CGROUP_H__
> >   #define __DRM_CGROUP_H__
> >
> > +#include <linux/cgroup_drm.h>
> > +
> >   #ifdef CONFIG_CGROUP_DRM
> >
> >   int drmcgrp_register_device(struct drm_device *device);
> > -
> >   int drmcgrp_unregister_device(struct drm_device *device);
> > -
> > +bool drmcgrp_is_self_or_ancestor(struct drmcgrp *self,
> > +             struct drmcgrp *relative);
> > +void drmcgrp_chg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
> > +             size_t size);
> > +void drmcgrp_unchg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
> > +             size_t size);
> > +bool drmcgrp_bo_can_allocate(struct task_struct *task, struct drm_device *dev,
> > +             size_t size);
> >   #else
> >   static inline int drmcgrp_register_device(struct drm_device *device)
> >   {
> > @@ -20,5 +28,27 @@ static inline int drmcgrp_unregister_device(struct drm_device *device)
> >   {
> >       return 0;
> >   }
> > +
> > +static inline bool drmcgrp_is_self_or_ancestor(struct drmcgrp *self,
> > +             struct drmcgrp *relative)
> > +{
> > +     return false;
> > +}
> > +
> > +static inline void drmcgrp_chg_bo_alloc(struct drmcgrp *drmcgrp,
> > +             struct drm_device *dev, size_t size)
> > +{
> > +}
> > +
> > +static inline void drmcgrp_unchg_bo_alloc(struct drmcgrp *drmcgrp,
> > +             struct drm_device *dev, size_t size)
> > +{
> > +}
> > +
> > +static inline bool drmcgrp_bo_can_allocate(struct task_struct *task,
> > +             struct drm_device *dev, size_t size)
> > +{
> > +     return true;
> > +}
> >   #endif /* CONFIG_CGROUP_DRM */
> >   #endif /* __DRM_CGROUP_H__ */
> > diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
> > index c95727425284..02854c674b5c 100644
> > --- a/include/drm/drm_gem.h
> > +++ b/include/drm/drm_gem.h
> > @@ -272,6 +272,17 @@ struct drm_gem_object {
> >        *
> >        */
> >       const struct drm_gem_object_funcs *funcs;
> > +
> > +     /**
> > +      * @drmcgrp:
> > +      *
> > +      * DRM cgroup this GEM object belongs to.
> > +         *
> > +         * This is used to track and limit the amount of GEM objects a user
> > +         * can allocate.  Since GEM objects can be shared, this is also used
> > +         * to ensure GEM objects are only shared within the same cgroup.
> > +      */
> > +     struct drmcgrp *drmcgrp;
> >   };
> >
> >   /**
> > diff --git a/include/linux/cgroup_drm.h b/include/linux/cgroup_drm.h
> > index d7ccf434ca6b..fe14ba7bb1cf 100644
> > --- a/include/linux/cgroup_drm.h
> > +++ b/include/linux/cgroup_drm.h
> > @@ -15,6 +15,9 @@
> >
> >   struct drmcgrp_device_resource {
> >       /* for per device stats */
> > +     s64                     bo_stats_total_allocated;
> > +
> > +     s64                     bo_limits_total_allocated;
> >   };
> >
> >   struct drmcgrp {
> > diff --git a/kernel/cgroup/drm.c b/kernel/cgroup/drm.c
> > index f9ef4bf042d8..bc3abff09113 100644
> > --- a/kernel/cgroup/drm.c
> > +++ b/kernel/cgroup/drm.c
> > @@ -15,6 +15,22 @@ static DEFINE_MUTEX(drmcgrp_mutex);
> >   struct drmcgrp_device {
> >       struct drm_device       *dev;
> >       struct mutex            mutex;
> > +
> > +     s64                     bo_limits_total_allocated_default;
> > +};
> > +
> > +#define DRMCG_CTF_PRIV_SIZE 3
> > +#define DRMCG_CTF_PRIV_MASK GENMASK((DRMCG_CTF_PRIV_SIZE - 1), 0)
> > +
> > +enum drmcgrp_res_type {
> > +     DRMCGRP_TYPE_BO_TOTAL,
> > +};
> > +
> > +enum drmcgrp_file_type {
> > +     DRMCGRP_FTYPE_STATS,
> > +     DRMCGRP_FTYPE_MAX,
> > +     DRMCGRP_FTYPE_DEFAULT,
> > +     DRMCGRP_FTYPE_HELP,
> >   };
> >
> >   /* indexed by drm_minor for access speed */
> > @@ -53,6 +69,10 @@ static inline int init_drmcgrp_single(struct drmcgrp *drmcgrp, int i)
> >       }
> >
> >       /* set defaults here */
> > +     if (known_drmcgrp_devs[i] != NULL) {
> > +             ddr->bo_limits_total_allocated =
> > +               known_drmcgrp_devs[i]->bo_limits_total_allocated_default;
> > +     }
> >
> >       return 0;
> >   }
> > @@ -99,7 +119,187 @@ drmcgrp_css_alloc(struct cgroup_subsys_state *parent_css)
> >       return &drmcgrp->css;
> >   }
> >
> > +static inline void drmcgrp_print_stats(struct drmcgrp_device_resource *ddr,
> > +             struct seq_file *sf, enum drmcgrp_res_type type)
> > +{
> > +     if (ddr == NULL) {
> > +             seq_puts(sf, "\n");
> > +             return;
> > +     }
> > +
> > +     switch (type) {
> > +     case DRMCGRP_TYPE_BO_TOTAL:
> > +             seq_printf(sf, "%lld\n", ddr->bo_stats_total_allocated);
> > +             break;
> > +     default:
> > +             seq_puts(sf, "\n");
> > +             break;
> > +     }
> > +}
> > +
> > +static inline void drmcgrp_print_limits(struct drmcgrp_device_resource *ddr,
> > +             struct seq_file *sf, enum drmcgrp_res_type type)
> > +{
> > +     if (ddr == NULL) {
> > +             seq_puts(sf, "\n");
> > +             return;
> > +     }
> > +
> > +     switch (type) {
> > +     case DRMCGRP_TYPE_BO_TOTAL:
> > +             seq_printf(sf, "%lld\n", ddr->bo_limits_total_allocated);
> > +             break;
> > +     default:
> > +             seq_puts(sf, "\n");
> > +             break;
> > +     }
> > +}
> > +
> > +static inline void drmcgrp_print_default(struct drmcgrp_device *ddev,
> > +             struct seq_file *sf, enum drmcgrp_res_type type)
> > +{
> > +     if (ddev == NULL) {
> > +             seq_puts(sf, "\n");
> > +             return;
> > +     }
> > +
> > +     switch (type) {
> > +     case DRMCGRP_TYPE_BO_TOTAL:
> > +             seq_printf(sf, "%lld\n", ddev->bo_limits_total_allocated_default);
> > +             break;
> > +     default:
> > +             seq_puts(sf, "\n");
> > +             break;
> > +     }
> > +}
> > +
> > +static inline void drmcgrp_print_help(int cardNum, struct seq_file *sf,
> > +             enum drmcgrp_res_type type)
> > +{
> > +     switch (type) {
> > +     case DRMCGRP_TYPE_BO_TOTAL:
> > +             seq_printf(sf,
> > +             "Total amount of buffer allocation in bytes for card%d\n",
> > +             cardNum);
> > +             break;
> > +     default:
> > +             seq_puts(sf, "\n");
> > +             break;
> > +     }
> > +}
> > +
> > +int drmcgrp_bo_show(struct seq_file *sf, void *v)
> > +{
> > +     struct drmcgrp *drmcgrp = css_drmcgrp(seq_css(sf));
> > +     struct drmcgrp_device_resource *ddr = NULL;
> > +     enum drmcgrp_file_type f_type = seq_cft(sf)->
> > +             private & DRMCG_CTF_PRIV_MASK;
> > +     enum drmcgrp_res_type type = seq_cft(sf)->
> > +             private >> DRMCG_CTF_PRIV_SIZE;
> > +     struct drmcgrp_device *ddev;
> > +     int i;
> > +
> > +     for (i = 0; i <= max_minor; i++) {
> > +             ddr = drmcgrp->dev_resources[i];
> > +             ddev = known_drmcgrp_devs[i];
> > +
> > +             switch (f_type) {
> > +             case DRMCGRP_FTYPE_STATS:
> > +                     drmcgrp_print_stats(ddr, sf, type);
> > +                     break;
> > +             case DRMCGRP_FTYPE_MAX:
> > +                     drmcgrp_print_limits(ddr, sf, type);
> > +                     break;
> > +             case DRMCGRP_FTYPE_DEFAULT:
> > +                     drmcgrp_print_default(ddev, sf, type);
> > +                     break;
> > +             case DRMCGRP_FTYPE_HELP:
> > +                     drmcgrp_print_help(i, sf, type);
> > +                     break;
> > +             default:
> > +                     seq_puts(sf, "\n");
> > +                     break;
> > +             }
> > +     }
> > +
> > +     return 0;
> > +}
> > +
> > +ssize_t drmcgrp_bo_limit_write(struct kernfs_open_file *of, char *buf,
> > +             size_t nbytes, loff_t off)
> > +{
> > +     struct drmcgrp *drmcgrp = css_drmcgrp(of_css(of));
> > +     enum drmcgrp_res_type type = of_cft(of)->private >> DRMCG_CTF_PRIV_SIZE;
> > +     char *cft_name = of_cft(of)->name;
> > +     char *limits = strstrip(buf);
> > +     struct drmcgrp_device_resource *ddr;
> > +     char *sval;
> > +     s64 val;
> > +     int i = 0;
> > +     int rc;
> > +
> > +     while (i <= max_minor && limits != NULL) {
> > +             sval =  strsep(&limits, "\n");
> > +             rc = kstrtoll(sval, 0, &val);
> > +
> > +             if (rc) {
> > +                     pr_err("drmcgrp: %s: minor %d, err %d. ",
> > +                             cft_name, i, rc);
> > +                     pr_cont_cgroup_name(drmcgrp->css.cgroup);
> > +                     pr_cont("\n");
> > +             } else {
> > +                     ddr = drmcgrp->dev_resources[i];
> > +                     switch (type) {
> > +                     case DRMCGRP_TYPE_BO_TOTAL:
> > +                                if (val < 0) continue;
> > +                             ddr->bo_limits_total_allocated = val;
> > +                             break;
> > +                     default:
> > +                             break;
> > +                     }
> > +             }
> > +
> > +             i++;
> > +     }
> > +
> > +     if (i <= max_minor) {
> > +             pr_err("drmcgrp: %s: less entries than # of drm devices. ",
> > +                             cft_name);
> > +             pr_cont_cgroup_name(drmcgrp->css.cgroup);
> > +             pr_cont("\n");
> > +     }
> > +
> > +     return nbytes;
> > +}
> > +
> >   struct cftype files[] = {
> > +     {
> > +             .name = "buffer.total.stats",
> > +             .seq_show = drmcgrp_bo_show,
> > +             .private = (DRMCGRP_TYPE_BO_TOTAL << DRMCG_CTF_PRIV_SIZE) |
> > +                     DRMCGRP_FTYPE_STATS,
> > +     },
> > +     {
> > +             .name = "buffer.total.default",
> > +             .seq_show = drmcgrp_bo_show,
> > +             .flags = CFTYPE_ONLY_ON_ROOT,
> > +             .private = (DRMCGRP_TYPE_BO_TOTAL << DRMCG_CTF_PRIV_SIZE) |
> > +                     DRMCGRP_FTYPE_DEFAULT,
> > +     },
> > +     {
> > +             .name = "buffer.total.help",
> > +             .seq_show = drmcgrp_bo_show,
> > +             .flags = CFTYPE_ONLY_ON_ROOT,
> > +             .private = (DRMCGRP_TYPE_BO_TOTAL << DRMCG_CTF_PRIV_SIZE) |
> > +                     DRMCGRP_FTYPE_HELP,
> > +     },
> > +     {
> > +             .name = "buffer.total.max",
> > +             .write = drmcgrp_bo_limit_write,
> > +             .seq_show = drmcgrp_bo_show,
> > +             .private = (DRMCGRP_TYPE_BO_TOTAL << DRMCG_CTF_PRIV_SIZE) |
> > +                     DRMCGRP_FTYPE_MAX,
> > +     },
> >       { }     /* terminate */
> >   };
> >
> > @@ -122,6 +322,8 @@ int drmcgrp_register_device(struct drm_device *dev)
> >               return -ENOMEM;
> >
> >       ddev->dev = dev;
> > +     ddev->bo_limits_total_allocated_default = S64_MAX;
> > +
> >       mutex_init(&ddev->mutex);
> >
> >       mutex_lock(&drmcgrp_mutex);
> > @@ -156,3 +358,81 @@ int drmcgrp_unregister_device(struct drm_device *dev)
> >       return 0;
> >   }
> >   EXPORT_SYMBOL(drmcgrp_unregister_device);
> > +
> > +bool drmcgrp_is_self_or_ancestor(struct drmcgrp *self, struct drmcgrp *relative)
> > +{
> > +     for (; self != NULL; self = parent_drmcgrp(self))
> > +             if (self == relative)
> > +                     return true;
> > +
> > +     return false;
> > +}
> > +EXPORT_SYMBOL(drmcgrp_is_self_or_ancestor);
> > +
> > +bool drmcgrp_bo_can_allocate(struct task_struct *task, struct drm_device *dev,
> > +             size_t size)
> > +{
> > +     struct drmcgrp *drmcgrp = get_drmcgrp(task);
> > +     struct drmcgrp_device_resource *ddr;
> > +     struct drmcgrp_device_resource *d;
> > +     int devIdx = dev->primary->index;
> > +     bool result = true;
> > +     s64 delta = 0;
> > +
> > +     if (drmcgrp == NULL || drmcgrp == root_drmcgrp)
> > +             return true;
> > +
> > +     ddr = drmcgrp->dev_resources[devIdx];
> > +     mutex_lock(&known_drmcgrp_devs[devIdx]->mutex);
> > +     for ( ; drmcgrp != root_drmcgrp; drmcgrp = parent_drmcgrp(drmcgrp)) {
> > +             d = drmcgrp->dev_resources[devIdx];
> > +             delta = d->bo_limits_total_allocated -
> > +                             d->bo_stats_total_allocated;
> > +
> > +             if (delta <= 0 || size > delta) {
> > +                     result = false;
> > +                     break;
> > +             }
> > +     }
> > +     mutex_unlock(&known_drmcgrp_devs[devIdx]->mutex);
> > +
> > +     return result;
> > +}
> > +EXPORT_SYMBOL(drmcgrp_bo_can_allocate);
> > +
> > +void drmcgrp_chg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
> > +             size_t size)
> > +{
> > +     struct drmcgrp_device_resource *ddr;
> > +     int devIdx = dev->primary->index;
> > +
> > +     if (drmcgrp == NULL || known_drmcgrp_devs[devIdx] == NULL)
> > +             return;
> > +
> > +     mutex_lock(&known_drmcgrp_devs[devIdx]->mutex);
> > +     for ( ; drmcgrp != NULL; drmcgrp = parent_drmcgrp(drmcgrp)) {
> > +             ddr = drmcgrp->dev_resources[devIdx];
> > +
> > +             ddr->bo_stats_total_allocated += (s64)size;
> > +     }
> > +     mutex_unlock(&known_drmcgrp_devs[devIdx]->mutex);
> > +}
> > +EXPORT_SYMBOL(drmcgrp_chg_bo_alloc);
> > +
> > +void drmcgrp_unchg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
> > +             size_t size)
> > +{
> > +     struct drmcgrp_device_resource *ddr;
> > +     int devIdx = dev->primary->index;
> > +
> > +     if (drmcgrp == NULL || known_drmcgrp_devs[devIdx] == NULL)
> > +             return;
> > +
> > +     ddr = drmcgrp->dev_resources[devIdx];
> > +     mutex_lock(&known_drmcgrp_devs[devIdx]->mutex);
> > +     for ( ; drmcgrp != NULL; drmcgrp = parent_drmcgrp(drmcgrp))
> > +             drmcgrp->dev_resources[devIdx]->bo_stats_total_allocated
> > +                     -= (s64)size;
> > +     mutex_unlock(&known_drmcgrp_devs[devIdx]->mutex);
> > +}
> > +EXPORT_SYMBOL(drmcgrp_unchg_bo_alloc);
>

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC PATCH v2 0/5] new cgroup controller for gpu/drm subsystem
  2019-05-10 12:31     ` [RFC PATCH v2 0/5] new cgroup controller for gpu/drm subsystem Christian König
@ 2019-05-10 15:07         ` Kenny Ho
  0 siblings, 0 replies; 80+ messages in thread
From: Kenny Ho @ 2019-05-10 15:07 UTC (permalink / raw)
  To: Christian König
  Cc: sunnanyong, Kenny Ho, Brian Welty, amd-gfx, Alex Deucher,
	dri-devel, Tejun Heo, cgroups

On Fri, May 10, 2019 at 8:31 AM Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> I think it is a good approach to try to add a global limit first and
> when that's working go ahead with limiting device specific resources.
What are some of the global drm resource limits/allocations that
would be useful to implement?  I would be happy to dig into those.

Regards,
Kenny


> The only major issue I can see is on patch #4, see there for further
> details.
>
> Christian.
>
> On 09.05.19 at 23:04, Kenny Ho wrote:
> > This is a follow up to the RFC I made last november to introduce a cgroup controller for the GPU/DRM subsystem [a].  The goal is to be able to provide resource management to GPU resources using things like container.  The cover letter from v1 is copied below for reference.
> >
> > Usage examples:
> > // set limit for card1 to 1GB
> > sed -i '2s/.*/1073741824/' /sys/fs/cgroup/<cgroup>/drm.buffer.total.max
> >
> > // set limit for card0 to 512MB
> > sed -i '1s/.*/536870912/' /sys/fs/cgroup/<cgroup>/drm.buffer.total.max
> >
> >
> > v2:
> > * Removed the vendoring concepts
> > * Add limit to total buffer allocation
> > * Add limit to the maximum size of a buffer allocation
> >
> > TODO: process migration
> > TODO: documentations
> >
> > [a]: https://lists.freedesktop.org/archives/dri-devel/2018-November/197106.html
> >
> > v1: cover letter
> >
> > The purpose of this patch series is to start a discussion for a generic cgroup
> > controller for the drm subsystem.  The design proposed here is a very early one.
> > We are hoping to engage the community as we develop the idea.
> >
> >
> > Backgrounds
> > ==========
> > Control Groups/cgroup provide a mechanism for aggregating/partitioning sets of
> > tasks, and all their future children, into hierarchical groups with specialized
> > behaviour, such as accounting/limiting the resources which processes in a cgroup
> > can access[1].  Weights, limits, protections, allocations are the main resource
> > distribution models.  Existing cgroup controllers includes cpu, memory, io,
> > rdma, and more.  cgroup is one of the foundational technologies that enables the
> > popular container application deployment and management method.
> >
> > Direct Rendering Manager/drm contains code intended to support the needs of
> > complex graphics devices. Graphics drivers in the kernel may make use of DRM
> > functions to make tasks like memory management, interrupt handling and DMA
> > easier, and provide a uniform interface to applications.  The DRM has also
> > developed beyond traditional graphics applications to support compute/GPGPU
> > applications.
> >
> >
> > Motivations
> > =========
> > As GPU grow beyond the realm of desktop/workstation graphics into areas like
> > data center clusters and IoT, there are increasing needs to monitor and regulate
> > GPU as a resource like cpu, memory and io.
> >
> > Matt Roper from Intel began working on similar idea in early 2018 [2] for the
> > purpose of managing GPU priority using the cgroup hierarchy.  While that
> > particular use case may not warrant a standalone drm cgroup controller, there
> > are other use cases where having one can be useful [3].  Monitoring GPU
> > resources such as VRAM and buffers, CU (compute unit [AMD's nomenclature])/EU
> > (execution unit [Intel's nomenclature]), GPU job scheduling [4] can help
> > sysadmins get a better understanding of the applications usage profile.  Further
> > usage regulations of the aforementioned resources can also help sysadmins
> > optimize workload deployment on limited GPU resources.
> >
> > With the increased importance of machine learning, data science and other
> > cloud-based applications, GPUs are already in production use in data centers
> > today [5,6,7].  Existing GPU resource management is very course grain, however,
> > as sysadmins are only able to distribute workload on a per-GPU basis [8].  An
> > alternative is to use GPU virtualization (with or without SRIOV) but it
> > generally acts on the entire GPU instead of the specific resources in a GPU.
> > With a drm cgroup controller, we can enable alternate, fine-grain, sub-GPU
> > resource management (in addition to what may be available via GPU
> > virtualization.)
> >
> > In addition to production use, the DRM cgroup can also help with testing
> > graphics application robustness by providing a mean to artificially limit DRM
> > resources availble to the applications.
> >
> > Challenges
> > ========
> > While there are common infrastructure in DRM that is shared across many vendors
> > (the scheduler [4] for example), there are also aspects of DRM that are vendor
> > specific.  To accommodate this, we borrowed the mechanism used by the cgroup to
> > handle different kinds of cgroup controller.
> >
> > Resources for DRM are also often device (GPU) specific instead of system
> > specific and a system may contain more than one GPU.  For this, we borrowed some
> > of the ideas from RDMA cgroup controller.
> >
> > Approach
> > =======
> > To experiment with the idea of a DRM cgroup, we would like to start with basic
> > accounting and statistics, then continue to iterate and add regulating
> > mechanisms into the driver.
> >
> > [1] https://www.kernel.org/doc/Documentation/cgroup-v1/cgroups.txt
> > [2] https://lists.freedesktop.org/archives/intel-gfx/2018-January/153156.html
> > [3] https://www.spinics.net/lists/cgroups/msg20720.html
> > [4] https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/scheduler
> > [5] https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/
> > [6] https://blog.openshift.com/gpu-accelerated-sql-queries-with-postgresql-pg-strom-in-openshift-3-10/
> > [7] https://github.com/RadeonOpenCompute/k8s-device-plugin
> > [8] https://github.com/kubernetes/kubernetes/issues/52757
> >
> > Kenny Ho (5):
> >    cgroup: Introduce cgroup for drm subsystem
> >    cgroup: Add mechanism to register DRM devices
> >    drm/amdgpu: Register AMD devices for DRM cgroup
> >    drm, cgroup: Add total GEM buffer allocation limit
> >    drm, cgroup: Add peak GEM buffer allocation limit
> >
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c    |   4 +
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |   4 +
> >   drivers/gpu/drm/drm_gem.c                  |   7 +
> >   drivers/gpu/drm/drm_prime.c                |   9 +
> >   include/drm/drm_cgroup.h                   |  54 +++
> >   include/drm/drm_gem.h                      |  11 +
> >   include/linux/cgroup_drm.h                 |  47 ++
> >   include/linux/cgroup_subsys.h              |   4 +
> >   init/Kconfig                               |   5 +
> >   kernel/cgroup/Makefile                     |   1 +
> >   kernel/cgroup/drm.c                        | 497 +++++++++++++++++++++
> >   11 files changed, 643 insertions(+)
> >   create mode 100644 include/drm/drm_cgroup.h
> >   create mode 100644 include/linux/cgroup_drm.h
> >   create mode 100644 kernel/cgroup/drm.c
> >
>

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC PATCH v2 4/5] drm, cgroup: Add total GEM buffer allocation limit
  2019-05-10 14:57             ` Kenny Ho
@ 2019-05-10 15:08               ` Koenig, Christian
  -1 siblings, 0 replies; 80+ messages in thread
From: Koenig, Christian @ 2019-05-10 15:08 UTC (permalink / raw)
  To: Kenny Ho
  Cc: sunnanyong, Ho, Kenny, Brian Welty, amd-gfx, Deucher, Alexander,
	dri-devel, Tejun Heo, cgroups

On 10.05.19 at 16:57, Kenny Ho wrote:
> On Fri, May 10, 2019 at 8:28 AM Christian König
> <ckoenig.leichtzumerken@gmail.com> wrote:
>> On 09.05.19 at 23:04, Kenny Ho wrote:
>>> +     /* only allow bo from the same cgroup or its ancestor to be imported */
>>> +     if (drmcgrp != NULL &&
>>> +                     !drmcgrp_is_self_or_ancestor(drmcgrp, obj->drmcgrp)) {
>>> +             ret = -EACCES;
>>> +             goto out_unlock;
>>> +     }
>>> +
>> This will most likely go up in flames.
>>
>> If I'm not completely mistaken we already use
>> drm_gem_prime_fd_to_handle() to exchange handles between different
>> cgroups in current container usages.
> This is something that I am interested in getting more details from
> the broader community because the details affect how likely this will
> go up in flames ;).  Note that this check does not block sharing of
> handles from cgroup parent to children in the hierarchy, nor does it
> blocks sharing of handles within a cgroup.
>
> I am interested to find out, when existing apps share handles between
> containers, if there are any expectations on resource management.
> Since there are no drm cgroup for current container usage, I expect
> the answer to be no.  In this case, the drm cgroup controller can be
> disabled on its own (in the context of cgroup-v2's unified hierarchy),
> or the process can remain at the root for the drm cgroup hierarchy (in
> the context of cgroup-v1.)  If I understand the cgroup api correctly,
> that means all process would be part of the root cgroup as far as the
> drm controller is concerned and this block will not come into effect.
> I have verified that this is indeed the current default behaviour of a
> container runtime (runc, which is used by docker, podman and others.)
> The new drm cgroup controller is simply ignored and all processes
> remain at the root of the hierarchy (since there are no other
> cgroups.)  I plan to make contributions to runc (so folks can actually
> use this features with docker/podman/k8s, etc.) once things stabilized
> on the kernel side.

So the drm cgroup container is separate from other cgroup containers?

In other words, as long as userspace doesn't change, this wouldn't
have any effect?

Well, that is unexpected because then a process would be in different
groups for different controllers, but if that's really the case it
would certainly work.

> On the other hand, if there are expectations for resource management
> between containers, I would like to know who is the expected manager
> and how does it fit into the concept of container (which enforce some
> level of isolation.)  One possible manager may be the display server.
> But as long as the display server is in a parent cgroup of the apps'
> cgroup, the apps can still import handles from the display server
> under the current implementation.  My understanding is that this is
> most likely the case, with the display server simply sitting at the
> default/root cgroup.  But I certainly want to hear more about other
> use cases (for example, is running multiple display servers on a
> single host a realistic possibility?  Are there people running
> multiple display servers inside peer containers?  If so, how do they
> coordinate resources?)

We definitely have situations with multiple display servers running 
(just think of VR).

I just can't say if they currently use cgroups in any way.

Thanks,
Christian.

>
> I should probably summarize some of these into the commit message.
>
> Regards,
> Kenny
>
>
>
>> Christian.
>>


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC PATCH v2 4/5] drm, cgroup: Add total GEM buffer allocation limit
       [not found]               ` <1ca1363e-b39c-c299-1d24-098b1059f7ff-5C7GfCeVMHo@public.gmane.org>
@ 2019-05-10 15:25                   ` Kenny Ho
  0 siblings, 0 replies; 80+ messages in thread
From: Kenny Ho @ 2019-05-10 15:25 UTC (permalink / raw)
  To: Koenig, Christian
  Cc: sunnanyong-hv44wF8Li93QT0dZR+AlfA, Ho, Kenny, Brian Welty,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Deucher, Alexander,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Tejun Heo,
	cgroups-u79uwXL29TY76Z2rM5mHXA

On Fri, May 10, 2019 at 11:08 AM Koenig, Christian
<Christian.Koenig@amd.com> wrote:
> On 10.05.19 at 16:57, Kenny Ho wrote:
> > On Fri, May 10, 2019 at 8:28 AM Christian König
> > <ckoenig.leichtzumerken@gmail.com> wrote:
> >> On 09.05.19 at 23:04, Kenny Ho wrote:
> So the drm cgroup container is separate to other cgroup containers?
In cgroup-v1, which is most widely deployed currently, all controllers
have their own hierarchy (see /sys/fs/cgroup/).  In cgroup-v2, the
hierarchy is unified but individual controllers can be disabled (I
believe; I am not super familiar with v2.)

> In other words as long as userspace doesn't change, this wouldn't have
> any effect?
As far as things like docker and podman are concerned, yes.  I am not
sure about the behaviour of others like lxc, lxd, etc., because I
haven't used those myself.

> Well that is unexpected cause then a processes would be in different
> groups for different controllers, but if that's really the case that
> would certainly work.
I believe this is a possibility for v1 and is why folks came up with
the unified hierarchy in v2 to solve some of the issues.
https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html#issues-with-v1-and-rationales-for-v2
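
A rough illustration of the difference (the paths below are the common
default mount points, and "drm" is just the controller name proposed
by this series, not something already in an upstream kernel):

# cgroup-v1: one hierarchy per controller, so a task can sit in a
# different cgroup for each controller
ls /sys/fs/cgroup/         # cpu/ memory/ pids/ ... (one tree per controller)
cat /proc/self/cgroup      # one line per mounted v1 hierarchy

# cgroup-v2: a single unified tree; a controller is only active below
# a cgroup once it is listed in that cgroup's cgroup.subtree_control
cat /sys/fs/cgroup/cgroup.controllers
echo "+drm" > /sys/fs/cgroup/cgroup.subtree_control    # enable for children
echo "-drm" > /sys/fs/cgroup/cgroup.subtree_control    # disable it again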

Regards,
Kenny

> > On the other hand, if there are expectations for resource management
> > between containers, I would like to know who is the expected manager
> > and how does it fit into the concept of container (which enforce some
> > level of isolation.)  One possible manager may be the display server.
> > But as long as the display server is in a parent cgroup of the apps'
> > cgroup, the apps can still import handles from the display server
> > under the current implementation.  My understanding is that this is
> > most likely the case, with the display server simply sitting at the
> > default/root cgroup.  But I certainly want to hear more about other
> > use cases (for example, is running multiple display servers on a
> > single host a realistic possibility?  Are there people running
> > multiple display servers inside peer containers?  If so, how do they
> > coordinate resources?)
>
> We definitely have situations with multiple display servers running
> (just think of VR).
>
> I just can't say if they currently use cgroups in any way.
>
> Thanks,
> Christian.
>
> >
> > I should probably summarize some of these into the commit message.
> >
> > Regards,
> > Kenny
> >
> >
> >
> >> Christian.
> >>
>

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC PATCH v2 0/5] new cgroup controller for gpu/drm subsystem
       [not found]         ` <CAOWid-dJZrnAifFYByh4p9x-jA1o_5YWkoNVAVbdRUaxzdPbGA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2019-05-10 17:46             ` Koenig, Christian
  0 siblings, 0 replies; 80+ messages in thread
From: Koenig, Christian @ 2019-05-10 17:46 UTC (permalink / raw)
  To: Kenny Ho
  Cc: sunnanyong-hv44wF8Li93QT0dZR+AlfA, Ho, Kenny, Brian Welty,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Deucher, Alexander,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Tejun Heo,
	cgroups-u79uwXL29TY76Z2rM5mHXA

On 10.05.19 at 17:07, Kenny Ho wrote:
> On Fri, May 10, 2019 at 8:31 AM Christian König
> <ckoenig.leichtzumerken@gmail.com> wrote:
>> I think it is a good approach to try to add a global limit first and
>> when that's working go ahead with limiting device specific resources.
> What are some of the global drm resource limit/allocation that would
> be useful to implement? I would be happy to dig into those.

I was thinking about device-specific stuff like VRAM, etc.

What I'm also not clear about is how this should interact with memcg. 
E.g. do we also need to account BOs in memcg?

In theory I would say yes.

Christian.

>
> Regards,
> Kenny
>
>
>> The only major issue I can see is on patch #4, see there for further
>> details.
>>
>> Christian.
>>
>> On 09.05.19 at 23:04, Kenny Ho wrote:
>>> This is a follow up to the RFC I made last november to introduce a cgroup controller for the GPU/DRM subsystem [a].  The goal is to be able to provide resource management to GPU resources using things like container.  The cover letter from v1 is copied below for reference.
>>>
>>> Usage examples:
>>> // set limit for card1 to 1GB
>>> sed -i '2s/.*/1073741824/' /sys/fs/cgroup/<cgroup>/drm.buffer.total.max
>>>
>>> // set limit for card0 to 512MB
>>> sed -i '1s/.*/536870912/' /sys/fs/cgroup/<cgroup>/drm.buffer.total.max
>>>
>>>
>>> v2:
>>> * Removed the vendoring concepts
>>> * Add limit to total buffer allocation
>>> * Add limit to the maximum size of a buffer allocation
>>>
>>> TODO: process migration
>>> TODO: documentations
>>>
>>> [a]: https://lists.freedesktop.org/archives/dri-devel/2018-November/197106.html
>>>
>>> v1: cover letter
>>>
>>> The purpose of this patch series is to start a discussion for a generic cgroup
>>> controller for the drm subsystem.  The design proposed here is a very early one.
>>> We are hoping to engage the community as we develop the idea.
>>>
>>>
>>> Backgrounds
>>> ==========
>>> Control Groups/cgroup provide a mechanism for aggregating/partitioning sets of
>>> tasks, and all their future children, into hierarchical groups with specialized
>>> behaviour, such as accounting/limiting the resources which processes in a cgroup
>>> can access[1].  Weights, limits, protections, allocations are the main resource
>>> distribution models.  Existing cgroup controllers includes cpu, memory, io,
>>> rdma, and more.  cgroup is one of the foundational technologies that enables the
>>> popular container application deployment and management method.
>>>
>>> Direct Rendering Manager/drm contains code intended to support the needs of
>>> complex graphics devices. Graphics drivers in the kernel may make use of DRM
>>> functions to make tasks like memory management, interrupt handling and DMA
>>> easier, and provide a uniform interface to applications.  The DRM has also
>>> developed beyond traditional graphics applications to support compute/GPGPU
>>> applications.
>>>
>>>
>>> Motivations
>>> =========
>>> As GPU grow beyond the realm of desktop/workstation graphics into areas like
>>> data center clusters and IoT, there are increasing needs to monitor and regulate
>>> GPU as a resource like cpu, memory and io.
>>>
>>> Matt Roper from Intel began working on similar idea in early 2018 [2] for the
>>> purpose of managing GPU priority using the cgroup hierarchy.  While that
>>> particular use case may not warrant a standalone drm cgroup controller, there
>>> are other use cases where having one can be useful [3].  Monitoring GPU
>>> resources such as VRAM and buffers, CU (compute unit [AMD's nomenclature])/EU
>>> (execution unit [Intel's nomenclature]), GPU job scheduling [4] can help
>>> sysadmins get a better understanding of the applications usage profile.  Further
>>> usage regulations of the aforementioned resources can also help sysadmins
>>> optimize workload deployment on limited GPU resources.
>>>
>>> With the increased importance of machine learning, data science and other
>>> cloud-based applications, GPUs are already in production use in data centers
>>> today [5,6,7].  Existing GPU resource management is very course grain, however,
>>> as sysadmins are only able to distribute workload on a per-GPU basis [8].  An
>>> alternative is to use GPU virtualization (with or without SRIOV) but it
>>> generally acts on the entire GPU instead of the specific resources in a GPU.
>>> With a drm cgroup controller, we can enable alternate, fine-grain, sub-GPU
>>> resource management (in addition to what may be available via GPU
>>> virtualization.)
>>>
>>> In addition to production use, the DRM cgroup can also help with testing
>>> graphics application robustness by providing a mean to artificially limit DRM
>>> resources availble to the applications.
>>>
>>> Challenges
>>> ========
>>> While there are common infrastructure in DRM that is shared across many vendors
>>> (the scheduler [4] for example), there are also aspects of DRM that are vendor
>>> specific.  To accommodate this, we borrowed the mechanism used by the cgroup to
>>> handle different kinds of cgroup controller.
>>>
>>> Resources for DRM are also often device (GPU) specific instead of system
>>> specific and a system may contain more than one GPU.  For this, we borrowed some
>>> of the ideas from RDMA cgroup controller.
>>>
>>> Approach
>>> =======
>>> To experiment with the idea of a DRM cgroup, we would like to start with basic
>>> accounting and statistics, then continue to iterate and add regulating
>>> mechanisms into the driver.
>>>
>>> [1] https://www.kernel.org/doc/Documentation/cgroup-v1/cgroups.txt
>>> [2] https://lists.freedesktop.org/archives/intel-gfx/2018-January/153156.html
>>> [3] https://www.spinics.net/lists/cgroups/msg20720.html
>>> [4] https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/scheduler
>>> [5] https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/
>>> [6] https://blog.openshift.com/gpu-accelerated-sql-queries-with-postgresql-pg-strom-in-openshift-3-10/
>>> [7] https://github.com/RadeonOpenCompute/k8s-device-plugin
>>> [8] https://github.com/kubernetes/kubernetes/issues/52757
>>>
>>> Kenny Ho (5):
>>>     cgroup: Introduce cgroup for drm subsystem
>>>     cgroup: Add mechanism to register DRM devices
>>>     drm/amdgpu: Register AMD devices for DRM cgroup
>>>     drm, cgroup: Add total GEM buffer allocation limit
>>>     drm, cgroup: Add peak GEM buffer allocation limit
>>>
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c    |   4 +
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |   4 +
>>>    drivers/gpu/drm/drm_gem.c                  |   7 +
>>>    drivers/gpu/drm/drm_prime.c                |   9 +
>>>    include/drm/drm_cgroup.h                   |  54 +++
>>>    include/drm/drm_gem.h                      |  11 +
>>>    include/linux/cgroup_drm.h                 |  47 ++
>>>    include/linux/cgroup_subsys.h              |   4 +
>>>    init/Kconfig                               |   5 +
>>>    kernel/cgroup/Makefile                     |   1 +
>>>    kernel/cgroup/drm.c                        | 497 +++++++++++++++++++++
>>>    11 files changed, 643 insertions(+)
>>>    create mode 100644 include/drm/drm_cgroup.h
>>>    create mode 100644 include/linux/cgroup_drm.h
>>>    create mode 100644 kernel/cgroup/drm.c
>>>


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC PATCH v2 0/5] new cgroup controller for gpu/drm subsystem
@ 2019-05-10 17:46             ` Koenig, Christian
  0 siblings, 0 replies; 80+ messages in thread
From: Koenig, Christian @ 2019-05-10 17:46 UTC (permalink / raw)
  To: Kenny Ho
  Cc: sunnanyong-hv44wF8Li93QT0dZR+AlfA, Ho, Kenny, Brian Welty,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Deucher, Alexander,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Tejun Heo,
	cgroups-u79uwXL29TY76Z2rM5mHXA

Am 10.05.19 um 17:07 schrieb Kenny Ho:
> [CAUTION: External Email]
>
> On Fri, May 10, 2019 at 8:31 AM Christian König
> <ckoenig.leichtzumerken@gmail.com> wrote:
>> I think it is a good approach to try to add a global limit first and
>> when that's working go ahead with limiting device specific resources.
> What are some of the global drm resource limit/allocation that would
> be useful to implement? I would be happy to dig into those.

I was thinking about device specific stuff like VRAM etc...

What I'm also not clear about is how this should interact with memcg. 
E.g. do we also need to account BOs in memcg?

In theory I would say yes.

Christian.

>
> Regards,
> Kenny
>
>
>> The only major issue I can see is on patch #4, see there for further
>> details.
>>
>> Christian.
>>
>> Am 09.05.19 um 23:04 schrieb Kenny Ho:
>>> This is a follow up to the RFC I made last november to introduce a cgroup controller for the GPU/DRM subsystem [a].  The goal is to be able to provide resource management to GPU resources using things like container.  The cover letter from v1 is copied below for reference.
>>>
>>> Usage examples:
>>> // set limit for card1 to 1GB
>>> sed -i '2s/.*/1073741824/' /sys/fs/cgroup/<cgroup>/drm.buffer.total.max
>>>
>>> // set limit for card0 to 512MB
>>> sed -i '1s/.*/536870912/' /sys/fs/cgroup/<cgroup>/drm.buffer.total.max
>>>
>>>
>>> v2:
>>> * Removed the vendoring concepts
>>> * Add limit to total buffer allocation
>>> * Add limit to the maximum size of a buffer allocation
>>>
>>> TODO: process migration
>>> TODO: documentations
>>>
>>> [a]: https://lists.freedesktop.org/archives/dri-devel/2018-November/197106.html
>>>
>>> v1: cover letter
>>>
>>> The purpose of this patch series is to start a discussion for a generic cgroup
>>> controller for the drm subsystem.  The design proposed here is a very early one.
>>> We are hoping to engage the community as we develop the idea.
>>>
>>>
>>> Backgrounds
>>> ==========
>>> Control Groups/cgroup provide a mechanism for aggregating/partitioning sets of
>>> tasks, and all their future children, into hierarchical groups with specialized
>>> behaviour, such as accounting/limiting the resources which processes in a cgroup
>>> can access[1].  Weights, limits, protections, allocations are the main resource
>>> distribution models.  Existing cgroup controllers includes cpu, memory, io,
>>> rdma, and more.  cgroup is one of the foundational technologies that enables the
>>> popular container application deployment and management method.
>>>
>>> Direct Rendering Manager/drm contains code intended to support the needs of
>>> complex graphics devices. Graphics drivers in the kernel may make use of DRM
>>> functions to make tasks like memory management, interrupt handling and DMA
>>> easier, and provide a uniform interface to applications.  The DRM has also
>>> developed beyond traditional graphics applications to support compute/GPGPU
>>> applications.
>>>
>>>
>>> Motivations
>>> =========
>>> As GPU grow beyond the realm of desktop/workstation graphics into areas like
>>> data center clusters and IoT, there are increasing needs to monitor and regulate
>>> GPU as a resource like cpu, memory and io.
>>>
>>> Matt Roper from Intel began working on similar idea in early 2018 [2] for the
>>> purpose of managing GPU priority using the cgroup hierarchy.  While that
>>> particular use case may not warrant a standalone drm cgroup controller, there
>>> are other use cases where having one can be useful [3].  Monitoring GPU
>>> resources such as VRAM and buffers, CU (compute unit [AMD's nomenclature])/EU
>>> (execution unit [Intel's nomenclature]), GPU job scheduling [4] can help
>>> sysadmins get a better understanding of the applications usage profile.  Further
>>> usage regulations of the aforementioned resources can also help sysadmins
>>> optimize workload deployment on limited GPU resources.
>>>
>>> With the increased importance of machine learning, data science and other
>>> cloud-based applications, GPUs are already in production use in data centers
>>> today [5,6,7].  Existing GPU resource management is very course grain, however,
>>> as sysadmins are only able to distribute workload on a per-GPU basis [8].  An
>>> alternative is to use GPU virtualization (with or without SRIOV) but it
>>> generally acts on the entire GPU instead of the specific resources in a GPU.
>>> With a drm cgroup controller, we can enable alternate, fine-grain, sub-GPU
>>> resource management (in addition to what may be available via GPU
>>> virtualization.)
>>>
>>> In addition to production use, the DRM cgroup can also help with testing
>>> graphics application robustness by providing a mean to artificially limit DRM
>>> resources availble to the applications.
>>>
>>> Challenges
>>> ========
>>> While there are common infrastructure in DRM that is shared across many vendors
>>> (the scheduler [4] for example), there are also aspects of DRM that are vendor
>>> specific.  To accommodate this, we borrowed the mechanism used by the cgroup to
>>> handle different kinds of cgroup controller.
>>>
>>> Resources for DRM are also often device (GPU) specific instead of system
>>> specific and a system may contain more than one GPU.  For this, we borrowed some
>>> of the ideas from RDMA cgroup controller.
>>>
>>> Approach
>>> =======
>>> To experiment with the idea of a DRM cgroup, we would like to start with basic
>>> accounting and statistics, then continue to iterate and add regulating
>>> mechanisms into the driver.
>>>
>>> [1] https://www.kernel.org/doc/Documentation/cgroup-v1/cgroups.txt
>>> [2] https://lists.freedesktop.org/archives/intel-gfx/2018-January/153156.html
>>> [3] https://www.spinics.net/lists/cgroups/msg20720.html
>>> [4] https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/scheduler
>>> [5] https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/
>>> [6] https://blog.openshift.com/gpu-accelerated-sql-queries-with-postgresql-pg-strom-in-openshift-3-10/
>>> [7] https://github.com/RadeonOpenCompute/k8s-device-plugin
>>> [8] https://github.com/kubernetes/kubernetes/issues/52757
>>>
>>> Kenny Ho (5):
>>>     cgroup: Introduce cgroup for drm subsystem
>>>     cgroup: Add mechanism to register DRM devices
>>>     drm/amdgpu: Register AMD devices for DRM cgroup
>>>     drm, cgroup: Add total GEM buffer allocation limit
>>>     drm, cgroup: Add peak GEM buffer allocation limit
>>>
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c    |   4 +
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |   4 +
>>>    drivers/gpu/drm/drm_gem.c                  |   7 +
>>>    drivers/gpu/drm/drm_prime.c                |   9 +
>>>    include/drm/drm_cgroup.h                   |  54 +++
>>>    include/drm/drm_gem.h                      |  11 +
>>>    include/linux/cgroup_drm.h                 |  47 ++
>>>    include/linux/cgroup_subsys.h              |   4 +
>>>    init/Kconfig                               |   5 +
>>>    kernel/cgroup/Makefile                     |   1 +
>>>    kernel/cgroup/drm.c                        | 497 +++++++++++++++++++++
>>>    11 files changed, 643 insertions(+)
>>>    create mode 100644 include/drm/drm_cgroup.h
>>>    create mode 100644 include/linux/cgroup_drm.h
>>>    create mode 100644 kernel/cgroup/drm.c
>>>

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC PATCH v2 4/5] drm, cgroup: Add total GEM buffer allocation limit
  2019-05-10 15:25                   ` Kenny Ho
@ 2019-05-10 17:48                     ` Koenig, Christian
  -1 siblings, 0 replies; 80+ messages in thread
From: Koenig, Christian @ 2019-05-10 17:48 UTC (permalink / raw)
  To: Kenny Ho
  Cc: sunnanyong, Ho, Kenny, Brian Welty, amd-gfx, Deucher, Alexander,
	dri-devel, Tejun Heo, cgroups

Am 10.05.19 um 17:25 schrieb Kenny Ho:
> On Fri, May 10, 2019 at 11:08 AM Koenig, Christian
> <Christian.Koenig@amd.com> wrote:
>> Am 10.05.19 um 16:57 schrieb Kenny Ho:
>>> On Fri, May 10, 2019 at 8:28 AM Christian König
>>> <ckoenig.leichtzumerken@gmail.com> wrote:
>>>> Am 09.05.19 um 23:04 schrieb Kenny Ho:
>> So the drm cgroup container is separate from other cgroup containers?
> In cgroup-v1, which is most widely deployed currently, all controllers
> have their own hierarchy (see /sys/fs/cgroup/).  In cgroup-v2, the
> hierarchy is unified but individual controllers can be disabled (I
> believe; I am not super familiar with v2.)
>
>> In other words as long as userspace doesn't change, this wouldn't have
>> any effect?
> As far as things like docker and podman are concerned, yes.  I am not
> sure about the behaviour of others like lxc, lxd, etc. because I
> haven't used those myself.
>
>> Well that is unexpected, because then a process would be in different
>> groups for different controllers, but if that's really the case that
>> would certainly work.
> I believe this is a possibility for v1 and is why folks came up with
> the unified hierarchy in v2 to solve some of the issues.
> https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html#issues-with-v1-and-rationales-for-v2

Well another question is why do we want to prevent that in the first place?

I mean the worst thing that can happen is that we account a BO multiple 
times.

And going in the same direction: where is the code to handle an open
device file descriptor which is sent from one cgroup to another?

Regards,
Christian.
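
(For context on the question above: a DRM fd typically crosses a cgroup
boundary by being opened in one process and passed over a unix socket.  A
userspace sketch of the sending side, using nothing beyond standard
SCM_RIGHTS; not anything from this series:)

#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Send an already-open fd (e.g. /dev/dri/card0) to a peer that may live
 * in a different cgroup; the receiver ends up with a duplicate of the fd. */
static int send_fd(int sock, int fd)
{
        char dummy = 'x';
        struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
        union { struct cmsghdr align; char buf[CMSG_SPACE(sizeof(int))]; } u;
        struct msghdr msg = { 0 };
        struct cmsghdr *cmsg;

        memset(&u, 0, sizeof(u));
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;
        msg.msg_control = u.buf;
        msg.msg_controllen = sizeof(u.buf);

        cmsg = CMSG_FIRSTHDR(&msg);
        cmsg->cmsg_level = SOL_SOCKET;
        cmsg->cmsg_type = SCM_RIGHTS;
        cmsg->cmsg_len = CMSG_LEN(sizeof(int));
        memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

        return sendmsg(sock, &msg, 0) < 0 ? -1 : 0;
}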

>
> Regards,
> Kenny
>
>>> On the other hand, if there are expectations for resource management
>>> between containers, I would like to know who is the expected manager
>>> and how does it fit into the concept of container (which enforce some
>>> level of isolation.)  One possible manager may be the display server.
>>> But as long as the display server is in a parent cgroup of the apps'
>>> cgroup, the apps can still import handles from the display server
>>> under the current implementation.  My understanding is that this is
>>> most likely the case, with the display server simply sitting at the
>>> default/root cgroup.  But I certainly want to hear more about other
>>> use cases (for example, is running multiple display servers on a
>>> single host a realistic possibility?  Are there people running
>>> multiple display servers inside peer containers?  If so, how do they
>>> coordinate resources?)
>> We definitely have situations with multiple display servers running
>> (just think of VR).
>>
>> I just can't say if they currently use cgroups in any way.
>>
>> Thanks,
>> Christian.
>>
>>> I should probably summarize some of these into the commit message.
>>>
>>> Regards,
>>> Kenny
>>>
>>>
>>>
>>>> Christian.
>>>>

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC PATCH v2 4/5] drm, cgroup: Add total GEM buffer allocation limit
  2019-05-10 17:48                     ` Koenig, Christian
  (?)
@ 2019-05-10 18:50                     ` Kenny Ho
       [not found]                       ` <CAOWid-es+C_iStQUkM52mO3TeP8eS9MX+emZDQNH2PyZCf=RHQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  -1 siblings, 1 reply; 80+ messages in thread
From: Kenny Ho @ 2019-05-10 18:50 UTC (permalink / raw)
  To: Koenig, Christian
  Cc: sunnanyong, Ho, Kenny, Brian Welty, amd-gfx, Deucher, Alexander,
	dri-devel, Tejun Heo, cgroups

On Fri, May 10, 2019 at 1:48 PM Koenig, Christian
<Christian.Koenig@amd.com> wrote:
> Well another question is why do we want to prevent that in the first place?
>
> I mean the worst thing that can happen is that we account a BO multiple
> times.
That's one of the problems.  The other one is the BO outliving the
lifetime of a cgroup, and there's no good way to un-charge the usage
when the BO is freed, so the count won't be accurate.

I have looked into two possible solutions.  One is to prevent a cgroup
from being removed while there are BOs owned by the cgroup still alive
(similar to how cgroup removal will fail if it still has processes
attached to it.)  My concern here is the possibility of never being able
to remove a cgroup due to the lifetime of a BO (continuously being
shared, reused and never dying.)  Perhaps you can shed some light on
this possibility.

The other one is to keep track of all the buffers and migrate them to
the parent if a cgroup is closed.  My concern here is the performance
overhead of tracking all the buffers.
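
(A rough sketch of that second option, assuming each cgroup keeps a list of
the GEM objects charged to it; the bo_list/bo_list_lock/drmcgrp_node fields
are hypothetical and not in this series, and since charges here are applied
to every ancestor the counters themselves would not need to move:)

/* Hypothetical css_offline handler: re-parent the objects charged to a
 * dying cgroup so their eventual release uncharges the parent instead.
 * Locking against the parent's list is elided for brevity. */
static void drmcgrp_css_offline(struct cgroup_subsys_state *css)
{
        struct drmcgrp *drmcgrp = css_drmcgrp(css);
        struct drmcgrp *parent = parent_drmcgrp(drmcgrp);
        struct drm_gem_object *obj, *tmp;

        if (!parent)
                return;

        spin_lock(&drmcgrp->bo_list_lock);
        list_for_each_entry_safe(obj, tmp, &drmcgrp->bo_list, drmcgrp_node) {
                obj->drmcgrp = parent;
                list_move(&obj->drmcgrp_node, &parent->bo_list);
        }
        spin_unlock(&drmcgrp->bo_list_lock);
}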

> And going into the same direction where is the code to handle an open
> device file descriptor which is send from one cgroup to another?
I looked into this before but I forgot what I found.  Perhaps folks
familiar with device cgroup can chime in.

Actually, I just did another quick search.  It looks like the
access is enforced at the inode level (__devcgroup_check_permission),
so the fd sent to another cgroup that does not have access to the
device should still not grant access.

Regards,
Kenny


> Regards,
> Christian.
>
> >
> > Regards,
> > Kenny
> >
> >>> On the other hand, if there are expectations for resource management
> >>> between containers, I would like to know who is the expected manager
> >>> and how does it fit into the concept of container (which enforce some
> >>> level of isolation.)  One possible manager may be the display server.
> >>> But as long as the display server is in a parent cgroup of the apps'
> >>> cgroup, the apps can still import handles from the display server
> >>> under the current implementation.  My understanding is that this is
> >>> most likely the case, with the display server simply sitting at the
> >>> default/root cgroup.  But I certainly want to hear more about other
> >>> use cases (for example, is running multiple display servers on a
> >>> single host a realistic possibility?  Are there people running
> >>> multiple display servers inside peer containers?  If so, how do they
> >>> coordinate resources?)
> >> We definitely have situations with multiple display servers running
> >> (just think of VR).
> >>
> >> I just can't say if they currently use cgroups in any way.
> >>
> >> Thanks,
> >> Christian.
> >>
> >>> I should probably summarize some of these into the commit message.
> >>>
> >>> Regards,
> >>> Kenny
> >>>
> >>>
> >>>
> >>>> Christian.
> >>>>
>
^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC PATCH v2 4/5] drm, cgroup: Add total GEM buffer allocation limit
       [not found]                       ` <CAOWid-es+C_iStQUkM52mO3TeP8eS9MX+emZDQNH2PyZCf=RHQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2019-05-13 15:10                         ` Daniel Vetter
  0 siblings, 0 replies; 80+ messages in thread
From: Daniel Vetter @ 2019-05-13 15:10 UTC (permalink / raw)
  To: Kenny Ho
  Cc: sunnanyong-hv44wF8Li93QT0dZR+AlfA, Ho, Kenny, Brian Welty,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Tejun Heo,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Deucher, Alexander,
	cgroups-u79uwXL29TY76Z2rM5mHXA, Koenig, Christian

On Fri, May 10, 2019 at 02:50:39PM -0400, Kenny Ho wrote:
> On Fri, May 10, 2019 at 1:48 PM Koenig, Christian
> <Christian.Koenig@amd.com> wrote:
> > Well another question is why do we want to prevent that in the first place?
> >
> > I mean the worst thing that can happen is that we account a BO multiple
> > times.
> That's one of the problems.  The other one is the BO outliving the
> lifetime of a cgroup and there's no good way to un-charge the usage
> when the BO is free so the count won't be accurate.
> 
> I have looked into two possible solutions.  One is to prevent cgroup
> from being removed when there are BOs owned by the cgroup still alive
> (similar to how cgroup removal will fail if it still has processes
> attached to it.)  My concern here is the possibility of not able to
> remove a cgroup forever due to the lifetime of a BO (continuously
> being shared and reuse and never die.)  Perhaps you can shed some
> light on this possibility.
> 
> The other one is to keep track of all the buffers and migrate them to
> the parent if a cgroup is closed.  My concern here is the performance
> overhead on tracking all the buffers.

My understanding is that other cgroups already use reference counting to
make sure the data structure in the kernel doesn't disappear too early. So
you can delete the cgroup, but it might not get freed completely until all
the BOs allocated from that cgroup are released. There's a recent LWN
article on how that's not all that awesome for the memory cgroup
controller, and what to do about it:

https://lwn.net/Articles/787614/

We probably want to align with whatever the mem cgroup folks come up with
(so _not_ prevent deletion of the cgroup, since that's different
behaviour).
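
(A minimal sketch of that, reusing the helpers from this series plus the
generic css_get()/css_put(); whether get_drmcgrp() already takes a reference
is an assumption here:)

/* Pin the owning cgroup for the BO's lifetime: rmdir of the cgroup is
 * still allowed, but the css and its counters stay around until the
 * last BO charged to it is released. */
static void drmcgrp_pin_and_charge(struct drm_gem_object *obj,
                                   struct drm_device *dev, size_t size)
{
        obj->drmcgrp = get_drmcgrp(current);
        css_get(&obj->drmcgrp->css);
        drmcgrp_chg_bo_alloc(obj->drmcgrp, dev, size);
}

static void drmcgrp_uncharge_and_unpin(struct drm_gem_object *obj)
{
        drmcgrp_unchg_bo_alloc(obj->drmcgrp, obj->dev, obj->size);
        css_put(&obj->drmcgrp->css);    /* last put lets the css be freed */
}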

> > And going into the same direction where is the code to handle an open
> > device file descriptor which is send from one cgroup to another?
> I looked into this before but I forgot what I found.  Perhaps folks
> familiar with device cgroup can chime in.
> 
> Actually, just did another quick search right now.  Looks like the
> access is enforced at the inode level (__devcgroup_check_permission)
> so the fd sent to another cgroup that does not have access to the
> device should still not have access.

That's the device cgroup, not the memory accounting stuff.

Imo for memory allocations we should look at what happens when you pass a
tmpfs file around to another cgroup and then extend it there. I think
those allocations are charged against the cgroup which actually allocates
stuff.

So for drm, if you pass around a device fd, then we always charge ioctl
calls to create a BO against the process doing the ioctl call, not against
the process which originally opened the device fd. For e.g. DRI3 that's
actually the only reasonable thing to do, since otherwise we'd charge
everything against the Xserver.
-Daniel

> 
> Regards,
> Kenny
> 
> 
> > Regards,
> > Christian.
> >
> > >
> > > Regards,
> > > Kenny
> > >
> > >>> On the other hand, if there are expectations for resource management
> > >>> between containers, I would like to know who is the expected manager
> > >>> and how does it fit into the concept of container (which enforce some
> > >>> level of isolation.)  One possible manager may be the display server.
> > >>> But as long as the display server is in a parent cgroup of the apps'
> > >>> cgroup, the apps can still import handles from the display server
> > >>> under the current implementation.  My understanding is that this is
> > >>> most likely the case, with the display server simply sitting at the
> > >>> default/root cgroup.  But I certainly want to hear more about other
> > >>> use cases (for example, is running multiple display servers on a
> > >>> single host a realistic possibility?  Are there people running
> > >>> multiple display servers inside peer containers?  If so, how do they
> > >>> coordinate resources?)
> > >> We definitely have situations with multiple display servers running
> > >> (just think of VR).
> > >>
> > >> I just can't say if they currently use cgroups in any way.
> > >>
> > >> Thanks,
> > >> Christian.
> > >>
> > >>> I should probably summarize some of these into the commit message.
> > >>>
> > >>> Regards,
> > >>> Kenny
> > >>>
> > >>>
> > >>>
> > >>>> Christian.
> > >>>>
> >
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC PATCH v2 4/5] drm, cgroup: Add total GEM buffer allocation limit
  2019-05-09 21:04   ` [RFC PATCH v2 4/5] drm, cgroup: Add total GEM buffer allocation limit Kenny Ho
       [not found]     ` <20190509210410.5471-5-Kenny.Ho-5C7GfCeVMHo@public.gmane.org>
@ 2019-05-15 21:26     ` Welty, Brian
       [not found]       ` <d81e8f55-9602-818e-0f9c-1d9d150133b1-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  1 sibling, 1 reply; 80+ messages in thread
From: Welty, Brian @ 2019-05-15 21:26 UTC (permalink / raw)
  To: Kenny Ho, y2kenny, cgroups, dri-devel, amd-gfx, tj, sunnanyong,
	alexander.deucher, Christian König


On 5/9/2019 2:04 PM, Kenny Ho wrote:
> The drm resource being measured and limited here is the GEM buffer
> objects.  User applications allocate and free these buffers.  In
> addition, a process can allocate a buffer and share it with another
> process.  The consumer of a shared buffer can also outlive the
> allocator of the buffer.
> 
> For the purpose of cgroup accounting and limiting, ownership of the
> buffer is deemed to be the cgroup for which the allocating process
> belongs to.  There is one limit per drm device.
> 
> In order to prevent the buffer outliving the cgroup that owns it, a
> process is prevented from importing buffers that are not own by the
> process' cgroup or the ancestors of the process' cgroup.
> 
> For this resource, the control files are prefixed with drm.buffer.total.

Overall, this framework looks very good.

But is this a useful resource to track?  See my question further below
at your drm_gem_private_object_init.


> 
> There are four control file types,
> stats (ro) - display current measured values for a resource
> max (rw) - limits for a resource
> default (ro, root cgroup only) - default values for a resource
> help (ro, root cgroup only) - help string for a resource
> 
> Each file is multi-lined with one entry/line per drm device.

Multi-line is correct for multiple devices, but I believe you need
to use a KEY to denote device for both your set and get routines.
I didn't see your set functions reading a key, or the get functions
printing the key in output.
cgroups-v2 conventions mention using KEY of major:minor, but I think
you can use drm_minor as key?
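
(For comparison, a nested keyed layout along those lines could look like the
following, with 226 being the DRM char major; the exact key choice is of
course the open question here.  io.max and rdma.max follow the same
one-key-per-line shape.)

        226:0 536870912
        226:1 max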

> 
> Usage examples:
> // set limit for card1 to 1GB
> sed -i '2s/.*/1073741824/' /sys/fs/cgroup/<cgroup>/drm.buffer.total.max
> 
> // set limit for card0 to 512MB
> sed -i '1s/.*/536870912/' /sys/fs/cgroup/<cgroup>/drm.buffer.total.max
> 
> Change-Id: I4c249d06d45ec709d6481d4cbe87c5168545c5d0
> Signed-off-by: Kenny Ho <Kenny.Ho@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |   4 +
>  drivers/gpu/drm/drm_gem.c                  |   7 +
>  drivers/gpu/drm/drm_prime.c                |   9 +
>  include/drm/drm_cgroup.h                   |  34 ++-
>  include/drm/drm_gem.h                      |  11 +
>  include/linux/cgroup_drm.h                 |   3 +
>  kernel/cgroup/drm.c                        | 280 +++++++++++++++++++++
>  7 files changed, 346 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index 93b2c5a48a71..b4c078b7ad63 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -34,6 +34,7 @@
>  #include <drm/drmP.h>
>  #include <drm/amdgpu_drm.h>
>  #include <drm/drm_cache.h>
> +#include <drm/drm_cgroup.h>
>  #include "amdgpu.h"
>  #include "amdgpu_trace.h"
>  #include "amdgpu_amdkfd.h"
> @@ -446,6 +447,9 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
>  	if (!amdgpu_bo_validate_size(adev, size, bp->domain))
>  		return -ENOMEM;
>  
> +	if (!drmcgrp_bo_can_allocate(current, adev->ddev, size))
> +		return -ENOMEM;
> +
>  	*bo_ptr = NULL;
>  
>  	acc_size = ttm_bo_dma_acc_size(&adev->mman.bdev, size,
> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> index 6a80db077dc6..cbd49bf34dcf 100644
> --- a/drivers/gpu/drm/drm_gem.c
> +++ b/drivers/gpu/drm/drm_gem.c
> @@ -37,10 +37,12 @@
>  #include <linux/shmem_fs.h>
>  #include <linux/dma-buf.h>
>  #include <linux/mem_encrypt.h>
> +#include <linux/cgroup_drm.h>
>  #include <drm/drmP.h>
>  #include <drm/drm_vma_manager.h>
>  #include <drm/drm_gem.h>
>  #include <drm/drm_print.h>
> +#include <drm/drm_cgroup.h>
>  #include "drm_internal.h"
>  
>  /** @file drm_gem.c
> @@ -154,6 +156,9 @@ void drm_gem_private_object_init(struct drm_device *dev,
>  	obj->handle_count = 0;
>  	obj->size = size;
>  	drm_vma_node_reset(&obj->vma_node);
> +
> +	obj->drmcgrp = get_drmcgrp(current);
> +	drmcgrp_chg_bo_alloc(obj->drmcgrp, dev, size);


Why do the charging here?
There is no backing store yet for the buffer, so this is really tracking something akin to allowed virtual memory for GEM objects?
Is this really useful for an administrator to control?
Isn't the resource we want to control actually the physical backing store? 


>  }
>  EXPORT_SYMBOL(drm_gem_private_object_init);
>  
> @@ -804,6 +809,8 @@ drm_gem_object_release(struct drm_gem_object *obj)
>  	if (obj->filp)
>  		fput(obj->filp);
>  
> +	drmcgrp_unchg_bo_alloc(obj->drmcgrp, obj->dev, obj->size);
> +
>  	drm_gem_free_mmap_offset(obj);
>  }
>  EXPORT_SYMBOL(drm_gem_object_release);
> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
> index 231e3f6d5f41..faed5611a1c6 100644
> --- a/drivers/gpu/drm/drm_prime.c
> +++ b/drivers/gpu/drm/drm_prime.c
> @@ -32,6 +32,7 @@
>  #include <drm/drm_prime.h>
>  #include <drm/drm_gem.h>
>  #include <drm/drmP.h>
> +#include <drm/drm_cgroup.h>
>  
>  #include "drm_internal.h"
>  
> @@ -794,6 +795,7 @@ int drm_gem_prime_fd_to_handle(struct drm_device *dev,
>  {
>  	struct dma_buf *dma_buf;
>  	struct drm_gem_object *obj;
> +	struct drmcgrp *drmcgrp = get_drmcgrp(current);
>  	int ret;
>  
>  	dma_buf = dma_buf_get(prime_fd);
> @@ -818,6 +820,13 @@ int drm_gem_prime_fd_to_handle(struct drm_device *dev,
>  		goto out_unlock;
>  	}
>  
> +	/* only allow bo from the same cgroup or its ancestor to be imported */
> +	if (drmcgrp != NULL &&
> +			!drmcgrp_is_self_or_ancestor(drmcgrp, obj->drmcgrp)) {
> +		ret = -EACCES;
> +		goto out_unlock;
> +	}
> +
>  	if (obj->dma_buf) {
>  		WARN_ON(obj->dma_buf != dma_buf);
>  	} else {
> diff --git a/include/drm/drm_cgroup.h b/include/drm/drm_cgroup.h
> index ddb9eab64360..8711b7c5f7bf 100644
> --- a/include/drm/drm_cgroup.h
> +++ b/include/drm/drm_cgroup.h
> @@ -4,12 +4,20 @@
>  #ifndef __DRM_CGROUP_H__
>  #define __DRM_CGROUP_H__
>  
> +#include <linux/cgroup_drm.h>
> +
>  #ifdef CONFIG_CGROUP_DRM
>  
>  int drmcgrp_register_device(struct drm_device *device);
> -
>  int drmcgrp_unregister_device(struct drm_device *device);
> -
> +bool drmcgrp_is_self_or_ancestor(struct drmcgrp *self,
> +		struct drmcgrp *relative);
> +void drmcgrp_chg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
> +		size_t size);
> +void drmcgrp_unchg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
> +		size_t size);
> +bool drmcgrp_bo_can_allocate(struct task_struct *task, struct drm_device *dev,
> +		size_t size);
>  #else
>  static inline int drmcgrp_register_device(struct drm_device *device)
>  {
> @@ -20,5 +28,27 @@ static inline int drmcgrp_unregister_device(struct drm_device *device)
>  {
>  	return 0;
>  }
> +
> +static inline bool drmcgrp_is_self_or_ancestor(struct drmcgrp *self,
> +		struct drmcgrp *relative)
> +{
> +	return false;
> +}
> +
> +static inline void drmcgrp_chg_bo_alloc(struct drmcgrp *drmcgrp,
> +		struct drm_device *dev,	size_t size)
> +{
> +}
> +
> +static inline void drmcgrp_unchg_bo_alloc(struct drmcgrp *drmcgrp,
> +		struct drm_device *dev,	size_t size)
> +{
> +}
> +
> +static inline bool drmcgrp_bo_can_allocate(struct task_struct *task,
> +		struct drm_device *dev,	size_t size)
> +{
> +	return true;
> +}
>  #endif /* CONFIG_CGROUP_DRM */
>  #endif /* __DRM_CGROUP_H__ */
> diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
> index c95727425284..02854c674b5c 100644
> --- a/include/drm/drm_gem.h
> +++ b/include/drm/drm_gem.h
> @@ -272,6 +272,17 @@ struct drm_gem_object {
>  	 *
>  	 */
>  	const struct drm_gem_object_funcs *funcs;
> +
> +	/**
> +	 * @drmcgrp:
> +	 *
> +	 * DRM cgroup this GEM object belongs to.
> +         *
> +         * This is used to track and limit the amount of GEM objects a user
> +         * can allocate.  Since GEM objects can be shared, this is also used
> +         * to ensure GEM objects are only shared within the same cgroup.
> +	 */
> +	struct drmcgrp *drmcgrp;
>  };
>  
>  /**
> diff --git a/include/linux/cgroup_drm.h b/include/linux/cgroup_drm.h
> index d7ccf434ca6b..fe14ba7bb1cf 100644
> --- a/include/linux/cgroup_drm.h
> +++ b/include/linux/cgroup_drm.h
> @@ -15,6 +15,9 @@
>  
>  struct drmcgrp_device_resource {
>  	/* for per device stats */
> +	s64			bo_stats_total_allocated;
> +
> +	s64			bo_limits_total_allocated;
>  };
>  
>  struct drmcgrp {
> diff --git a/kernel/cgroup/drm.c b/kernel/cgroup/drm.c
> index f9ef4bf042d8..bc3abff09113 100644
> --- a/kernel/cgroup/drm.c
> +++ b/kernel/cgroup/drm.c
> @@ -15,6 +15,22 @@ static DEFINE_MUTEX(drmcgrp_mutex);
>  struct drmcgrp_device {
>  	struct drm_device	*dev;
>  	struct mutex		mutex;
> +
> +	s64			bo_limits_total_allocated_default;
> +};
> +
> +#define DRMCG_CTF_PRIV_SIZE 3
> +#define DRMCG_CTF_PRIV_MASK GENMASK((DRMCG_CTF_PRIV_SIZE - 1), 0)
> +
> +enum drmcgrp_res_type {
> +	DRMCGRP_TYPE_BO_TOTAL,
> +};
> +
> +enum drmcgrp_file_type {
> +	DRMCGRP_FTYPE_STATS,
> +	DRMCGRP_FTYPE_MAX,
> +	DRMCGRP_FTYPE_DEFAULT,
> +	DRMCGRP_FTYPE_HELP,
>  };
>  
>  /* indexed by drm_minor for access speed */
> @@ -53,6 +69,10 @@ static inline int init_drmcgrp_single(struct drmcgrp *drmcgrp, int i)
>  	}
>  
>  	/* set defaults here */
> +	if (known_drmcgrp_devs[i] != NULL) {
> +		ddr->bo_limits_total_allocated =
> +		  known_drmcgrp_devs[i]->bo_limits_total_allocated_default;
> +	}
>  
>  	return 0;
>  }
> @@ -99,7 +119,187 @@ drmcgrp_css_alloc(struct cgroup_subsys_state *parent_css)
>  	return &drmcgrp->css;
>  }
>  
> +static inline void drmcgrp_print_stats(struct drmcgrp_device_resource *ddr,
> +		struct seq_file *sf, enum drmcgrp_res_type type)
> +{
> +	if (ddr == NULL) {
> +		seq_puts(sf, "\n");
> +		return;
> +	}
> +
> +	switch (type) {
> +	case DRMCGRP_TYPE_BO_TOTAL:
> +		seq_printf(sf, "%lld\n", ddr->bo_stats_total_allocated);
> +		break;
> +	default:
> +		seq_puts(sf, "\n");
> +		break;
> +	}
> +}
> +
> +static inline void drmcgrp_print_limits(struct drmcgrp_device_resource *ddr,
> +		struct seq_file *sf, enum drmcgrp_res_type type)
> +{
> +	if (ddr == NULL) {
> +		seq_puts(sf, "\n");
> +		return;
> +	}
> +
> +	switch (type) {
> +	case DRMCGRP_TYPE_BO_TOTAL:
> +		seq_printf(sf, "%lld\n", ddr->bo_limits_total_allocated);
> +		break;
> +	default:
> +		seq_puts(sf, "\n");
> +		break;
> +	}
> +}
> +
> +static inline void drmcgrp_print_default(struct drmcgrp_device *ddev,
> +		struct seq_file *sf, enum drmcgrp_res_type type)
> +{
> +	if (ddev == NULL) {
> +		seq_puts(sf, "\n");
> +		return;
> +	}
> +
> +	switch (type) {
> +	case DRMCGRP_TYPE_BO_TOTAL:
> +		seq_printf(sf, "%lld\n", ddev->bo_limits_total_allocated_default);
> +		break;
> +	default:
> +		seq_puts(sf, "\n");
> +		break;
> +	}
> +}
> +
> +static inline void drmcgrp_print_help(int cardNum, struct seq_file *sf,
> +		enum drmcgrp_res_type type)
> +{
> +	switch (type) {
> +	case DRMCGRP_TYPE_BO_TOTAL:
> +		seq_printf(sf,
> +		"Total amount of buffer allocation in bytes for card%d\n",
> +		cardNum);
> +		break;
> +	default:
> +		seq_puts(sf, "\n");
> +		break;
> +	}
> +}
> +
> +int drmcgrp_bo_show(struct seq_file *sf, void *v)
> +{
> +	struct drmcgrp *drmcgrp = css_drmcgrp(seq_css(sf));
> +	struct drmcgrp_device_resource *ddr = NULL;
> +	enum drmcgrp_file_type f_type = seq_cft(sf)->
> +		private & DRMCG_CTF_PRIV_MASK;
> +	enum drmcgrp_res_type type = seq_cft(sf)->
> +		private >> DRMCG_CTF_PRIV_SIZE;
> +	struct drmcgrp_device *ddev;
> +	int i;
> +
> +	for (i = 0; i <= max_minor; i++) {
> +		ddr = drmcgrp->dev_resources[i];
> +		ddev = known_drmcgrp_devs[i];
> +
> +		switch (f_type) {
> +		case DRMCGRP_FTYPE_STATS:
> +			drmcgrp_print_stats(ddr, sf, type);
> +			break;
> +		case DRMCGRP_FTYPE_MAX:
> +			drmcgrp_print_limits(ddr, sf, type);
> +			break;
> +		case DRMCGRP_FTYPE_DEFAULT:
> +			drmcgrp_print_default(ddev, sf, type);
> +			break;
> +		case DRMCGRP_FTYPE_HELP:
> +			drmcgrp_print_help(i, sf, type);
> +			break;
> +		default:
> +			seq_puts(sf, "\n");
> +			break;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +ssize_t drmcgrp_bo_limit_write(struct kernfs_open_file *of, char *buf,
> +		size_t nbytes, loff_t off)
> +{
> +	struct drmcgrp *drmcgrp = css_drmcgrp(of_css(of));
> +	enum drmcgrp_res_type type = of_cft(of)->private >> DRMCG_CTF_PRIV_SIZE;
> +	char *cft_name = of_cft(of)->name;
> +	char *limits = strstrip(buf);
> +	struct drmcgrp_device_resource *ddr;
> +	char *sval;
> +	s64 val;
> +	int i = 0;
> +	int rc;
> +
> +	while (i <= max_minor && limits != NULL) {
> +		sval =  strsep(&limits, "\n");
> +		rc = kstrtoll(sval, 0, &val);

Input should be "KEY VALUE", so KEY will determine the device to apply this to.
Also, per the cgroups-v2 documentation of limits, I believe you need to parse and handle the special "max" input value.

parse_resources() in the rdma controller is an example of both of the above.
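
(A minimal sketch of such a parser, loosely modelled on that; the helper name
and keeping the DRM minor as the key are assumptions:)

/* Hypothetical: parse one "MINOR VALUE" line, where VALUE may be "max". */
static int drmcgrp_parse_limit(char *line, int *minor, s64 *limit)
{
        char *key = strsep(&line, " \t");
        char *val = line ? strim(line) : NULL;
        int rc;

        if (!key || !val)
                return -EINVAL;

        rc = kstrtoint(strim(key), 10, minor);
        if (rc)
                return rc;

        if (!strcmp(val, "max")) {
                *limit = S64_MAX;       /* "max" means unlimited */
                return 0;
        }

        return kstrtoll(val, 0, limit);
}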

> +		if (rc) {
> +			pr_err("drmcgrp: %s: minor %d, err %d. ",
> +				cft_name, i, rc);
> +			pr_cont_cgroup_name(drmcgrp->css.cgroup);
> +			pr_cont("\n");
> +		} else {
> +			ddr = drmcgrp->dev_resources[i];
> +			switch (type) {
> +			case DRMCGRP_TYPE_BO_TOTAL:
> +                                if (val < 0) continue;
> +				ddr->bo_limits_total_allocated = val;
> +				break;
> +			default:
> +				break;
> +			}
> +		}
> +
> +		i++;
> +	}
> +
> +	if (i <= max_minor) {
> +		pr_err("drmcgrp: %s: less entries than # of drm devices. ",
> +				cft_name);
> +		pr_cont_cgroup_name(drmcgrp->css.cgroup);
> +		pr_cont("\n");
> +	}
> +
> +	return nbytes;
> +}
> +
>  struct cftype files[] = {
> +	{
> +		.name = "buffer.total.stats",
> +		.seq_show = drmcgrp_bo_show,
> +		.private = (DRMCGRP_TYPE_BO_TOTAL << DRMCG_CTF_PRIV_SIZE) |
> +			DRMCGRP_FTYPE_STATS,
> +	},
> +	{
> +		.name = "buffer.total.default",
> +		.seq_show = drmcgrp_bo_show,
> +		.flags = CFTYPE_ONLY_ON_ROOT,
> +		.private = (DRMCGRP_TYPE_BO_TOTAL << DRMCG_CTF_PRIV_SIZE) |
> +			DRMCGRP_FTYPE_DEFAULT,
> +	},
> +	{
> +		.name = "buffer.total.help",
> +		.seq_show = drmcgrp_bo_show,
> +		.flags = CFTYPE_ONLY_ON_ROOT,
> +		.private = (DRMCGRP_TYPE_BO_TOTAL << DRMCG_CTF_PRIV_SIZE) |
> +			DRMCGRP_FTYPE_HELP,
> +	},
> +	{
> +		.name = "buffer.total.max",
> +		.write = drmcgrp_bo_limit_write,
> +		.seq_show = drmcgrp_bo_show,
> +		.private = (DRMCGRP_TYPE_BO_TOTAL << DRMCG_CTF_PRIV_SIZE) |
> +			DRMCGRP_FTYPE_MAX,
> +	},
>  	{ }	/* terminate */
>  };
>  
> @@ -122,6 +322,8 @@ int drmcgrp_register_device(struct drm_device *dev)
>  		return -ENOMEM;
>  
>  	ddev->dev = dev;
> +	ddev->bo_limits_total_allocated_default = S64_MAX;
> +
>  	mutex_init(&ddev->mutex);
>  
>  	mutex_lock(&drmcgrp_mutex);
> @@ -156,3 +358,81 @@ int drmcgrp_unregister_device(struct drm_device *dev)
>  	return 0;
>  }
>  EXPORT_SYMBOL(drmcgrp_unregister_device);
> +
> +bool drmcgrp_is_self_or_ancestor(struct drmcgrp *self, struct drmcgrp *relative)
> +{
> +	for (; self != NULL; self = parent_drmcgrp(self))
> +		if (self == relative)
> +			return true;
> +
> +	return false;
> +}
> +EXPORT_SYMBOL(drmcgrp_is_self_or_ancestor);
> +
> +bool drmcgrp_bo_can_allocate(struct task_struct *task, struct drm_device *dev,
> +		size_t size)
> +{
> +	struct drmcgrp *drmcgrp = get_drmcgrp(task);
> +	struct drmcgrp_device_resource *ddr;
> +	struct drmcgrp_device_resource *d;
> +	int devIdx = dev->primary->index;
> +	bool result = true;
> +	s64 delta = 0;
> +
> +	if (drmcgrp == NULL || drmcgrp == root_drmcgrp)
> +		return true;
> +
> +	ddr = drmcgrp->dev_resources[devIdx];
> +	mutex_lock(&known_drmcgrp_devs[devIdx]->mutex);
> +	for ( ; drmcgrp != root_drmcgrp; drmcgrp = parent_drmcgrp(drmcgrp)) {
> +		d = drmcgrp->dev_resources[devIdx];
> +		delta = d->bo_limits_total_allocated -
> +				d->bo_stats_total_allocated;
> +
> +		if (delta <= 0 || size > delta) {
> +			result = false;
> +			break;
> +		}
> +	}
> +	mutex_unlock(&known_drmcgrp_devs[devIdx]->mutex);
> +
> +	return result;
> +}
> +EXPORT_SYMBOL(drmcgrp_bo_can_allocate);
> +
> +void drmcgrp_chg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
> +		size_t size)

Shouldn't this return an error and be implemented with the same semantics as the
try_charge() functions of other controllers?
The code below will allow stats_total_allocated to overrun limits_total_allocated.
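
(For illustration, a combined check-and-charge with try_charge semantics
might look roughly like the following, reusing the structures from this
patch; a sketch only, not tested:)

/* Hypothetical drmcgrp_try_chg_bo_alloc(): check every level of the
 * hierarchy and apply the charge in one critical section, so the stats
 * can never overrun the limits. */
static int drmcgrp_try_chg_bo_alloc(struct drmcgrp *drmcgrp,
                                    struct drm_device *dev, size_t size)
{
        int devIdx = dev->primary->index;
        struct drmcgrp_device_resource *ddr;
        struct drmcgrp *cg;
        int ret = 0;

        if (drmcgrp == NULL || known_drmcgrp_devs[devIdx] == NULL)
                return 0;

        mutex_lock(&known_drmcgrp_devs[devIdx]->mutex);
        for (cg = drmcgrp; cg != NULL; cg = parent_drmcgrp(cg)) {
                ddr = cg->dev_resources[devIdx];
                if (ddr->bo_stats_total_allocated + (s64)size >
                                ddr->bo_limits_total_allocated) {
                        ret = -ENOMEM;  /* over the limit somewhere up the tree */
                        goto out;
                }
        }
        for (cg = drmcgrp; cg != NULL; cg = parent_drmcgrp(cg))
                cg->dev_resources[devIdx]->bo_stats_total_allocated += (s64)size;
out:
        mutex_unlock(&known_drmcgrp_devs[devIdx]->mutex);
        return ret;
}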


> +{
> +	struct drmcgrp_device_resource *ddr;
> +	int devIdx = dev->primary->index;
> +
> +	if (drmcgrp == NULL || known_drmcgrp_devs[devIdx] == NULL)
> +		return;
> +
> +	mutex_lock(&known_drmcgrp_devs[devIdx]->mutex);
> +	for ( ; drmcgrp != NULL; drmcgrp = parent_drmcgrp(drmcgrp)) {
> +		ddr = drmcgrp->dev_resources[devIdx];
> +
> +		ddr->bo_stats_total_allocated += (s64)size;
> +	}
> +	mutex_unlock(&known_drmcgrp_devs[devIdx]->mutex);
> +}
> +EXPORT_SYMBOL(drmcgrp_chg_bo_alloc);
> +
> +void drmcgrp_unchg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
> +		size_t size)
> +{
> +	struct drmcgrp_device_resource *ddr;
> +	int devIdx = dev->primary->index;
> +
> +	if (drmcgrp == NULL || known_drmcgrp_devs[devIdx] == NULL)
> +		return;
> +
> +	ddr = drmcgrp->dev_resources[devIdx];
> +	mutex_lock(&known_drmcgrp_devs[devIdx]->mutex);
> +	for ( ; drmcgrp != NULL; drmcgrp = parent_drmcgrp(drmcgrp))
> +		drmcgrp->dev_resources[devIdx]->bo_stats_total_allocated
> +			-= (s64)size;
> +	mutex_unlock(&known_drmcgrp_devs[devIdx]->mutex);
> +}
> +EXPORT_SYMBOL(drmcgrp_unchg_bo_alloc);
> 
^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC PATCH v2 4/5] drm, cgroup: Add total GEM buffer allocation limit
       [not found]       ` <d81e8f55-9602-818e-0f9c-1d9d150133b1-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2019-05-16  2:29         ` Kenny Ho
       [not found]           ` <CAOWid-ftUrVVWPu9KuS8xpWKNQT6_FtxB8gEyEAn9nLD6qxb5Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 80+ messages in thread
From: Kenny Ho @ 2019-05-16  2:29 UTC (permalink / raw)
  To: Welty, Brian
  Cc: sunnanyong-hv44wF8Li93QT0dZR+AlfA, Kenny Ho,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Alex Deucher,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Tejun Heo,
	cgroups-u79uwXL29TY76Z2rM5mHXA, Christian König

On Wed, May 15, 2019 at 5:26 PM Welty, Brian <brian.welty@intel.com> wrote:
> On 5/9/2019 2:04 PM, Kenny Ho wrote:
> > There are four control file types,
> > stats (ro) - display current measured values for a resource
> > max (rw) - limits for a resource
> > default (ro, root cgroup only) - default values for a resource
> > help (ro, root cgroup only) - help string for a resource
> >
> > Each file is multi-lined with one entry/line per drm device.
>
> Multi-line is correct for multiple devices, but I believe you need
> to use a KEY to denote device for both your set and get routines.
> I didn't see your set functions reading a key, or the get functions
> printing the key in output.
> cgroups-v2 conventions mention using KEY of major:minor, but I think
> you can use drm_minor as key?
Given this controller is specific to the drm kernel subsystem, which
uses the minor to identify a drm device, I don't see a need to complicate
the interfaces more by having a major and a key.  As you can see in the
examples below, the drm device minor corresponds to the line number.
I am not sure how strict cgroup upstream is about the convention, but I
am hoping there is flexibility here to allow for what I have
implemented.  There are a couple of other things I have done that are
not described in the convention: 1) inclusion of a read-only *.help file
at the root cgroup, 2) use of a read-only (which I can potentially make rw)
*.default file instead of having default entries (since the default
can be different for different devices) inside the control files (this
way, resetting the cgroup values for all the drm devices can be
done with a simple 'cp'.)

> > Usage examples:
> > // set limit for card1 to 1GB
> > sed -i '2s/.*/1073741824/' /sys/fs/cgroup/<cgroup>/drm.buffer.total.max
> >
> > // set limit for card0 to 512MB
> > sed -i '1s/.*/536870912/' /sys/fs/cgroup/<cgroup>/drm.buffer.total.max


> >  /** @file drm_gem.c
> > @@ -154,6 +156,9 @@ void drm_gem_private_object_init(struct drm_device *dev,
> >       obj->handle_count = 0;
> >       obj->size = size;
> >       drm_vma_node_reset(&obj->vma_node);
> > +
> > +     obj->drmcgrp = get_drmcgrp(current);
> > +     drmcgrp_chg_bo_alloc(obj->drmcgrp, dev, size);
>
> Why do the charging here?
> There is no backing store yet for the buffer, so this is really tracking something akin to allowed virtual memory for GEM objects?
> Is this really useful for an administrator to control?
> Isn't the resource we want to control actually the physical backing store?
That's correct.  This is just the first level of control, since the
backing store can be backed by different types of memory.  I am in the
process of adding at least two more resources.  Stay tuned.  I am
doing the charge here to enforce the idea of "creator is deemed owner"
at a place where the code is shared by all (the init function.)

> > +     while (i <= max_minor && limits != NULL) {
> > +             sval =  strsep(&limits, "\n");
> > +             rc = kstrtoll(sval, 0, &val);
>
> Input should be "KEY VALUE", so KEY will determine device to apply this to.
> Also, per cgroups-v2 documentation of limits, I believe need to parse and handle the special "max" input value.
>
> parse_resources() in rdma controller is example for both of above.
Please see my previous reply for the rationale behind my hope to not need
a key.  I can certainly add handling of "max" and "default".


> > +void drmcgrp_chg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
> > +             size_t size)
>
> Shouldn't this return an error and be implemented with same semantics as the
> try_charge() functions of other controllers?
> Below will allow stats_total_allocated to overrun limits_total_allocated.
This is because I am charging the buffer at buffer init, which does not
fail, so the "try" (drmcgrp_bo_can_allocate) is separate and placed
earlier, nearer the other conditions where gem object allocation may
fail.  In other words, there are multiple reasons why gem allocation may
fail (the cgroup limit being one of them), and satisfying the cgroup
limit does not mean a charge is needed.  I can certainly combine the two
functions to have an additional try_charge semantic as well if that is
really needed.

Regards,
Kenny

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC PATCH v2 4/5] drm, cgroup: Add total GEM buffer allocation limit
       [not found]           ` <CAOWid-ftUrVVWPu9KuS8xpWKNQT6_FtxB8gEyEAn9nLD6qxb5Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2019-05-16  7:16             ` Koenig, Christian
  2019-05-16  7:25               ` Christian König
  2019-05-16 14:10             ` Tejun Heo
  1 sibling, 1 reply; 80+ messages in thread
From: Koenig, Christian @ 2019-05-16  7:16 UTC (permalink / raw)
  To: Kenny Ho, Welty, Brian
  Cc: sunnanyong-hv44wF8Li93QT0dZR+AlfA, Ho, Kenny,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Deucher, Alexander,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Tejun Heo,
	cgroups-u79uwXL29TY76Z2rM5mHXA

Am 16.05.19 um 04:29 schrieb Kenny Ho:
> On Wed, May 15, 2019 at 5:26 PM Welty, Brian <brian.welty@intel.com> wrote:
>> On 5/9/2019 2:04 PM, Kenny Ho wrote:
>>> There are four control file types,
>>> stats (ro) - display current measured values for a resource
>>> max (rw) - limits for a resource
>>> default (ro, root cgroup only) - default values for a resource
>>> help (ro, root cgroup only) - help string for a resource
>>>
>>> Each file is multi-lined with one entry/line per drm device.
>> Multi-line is correct for multiple devices, but I believe you need
>> to use a KEY to denote device for both your set and get routines.
>> I didn't see your set functions reading a key, or the get functions
>> printing the key in output.
>> cgroups-v2 conventions mention using KEY of major:minor, but I think
>> you can use drm_minor as key?
> Given this controller is specific to the drm kernel subsystem which
> uses minor to identify drm device,

Wait a second, using the DRM minor is a good idea in the first place.

I have a test system with a Vega10 and a Vega20. Which device gets which 
minor is not stable, but rather defined by the scan order of the PCIe bus.

Normally the scan order is always the same, but adding or removing 
devices or delaying things just a little bit during init is enough to 
change this.

We need something like the Linux sysfs location or similar to have a 
stable implementation.

Regards,
Christian.
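
(For illustration, the bus id of the parent device is one stable handle that
is already available on the kernel side; a sketch only, not a proposal from
this thread:)

/* dev_name() on the parent struct device yields a bus-stable identifier
 * such as "0000:0b:00.0" for a PCI GPU, independent of probe order. */
static const char *drmcgrp_stable_name(struct drm_device *ddev)
{
        return ddev->dev ? dev_name(ddev->dev) : "unknown";
}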

>   I don't see a need to complicate
> the interfaces more by having major and a key.  As you can see in the
> examples below, the drm device minor corresponds to the line number.
> I am not sure how strict cgroup upstream is about the convention but I
> am hoping there are flexibility here to allow for what I have
> implemented.  There are a couple of other things I have done that is
> not described in the convention: 1) inclusion of read-only *.help file
> at the root cgroup, 2) use read-only (which I can potentially make rw)
> *.default file instead of having a default entries (since the default
> can be different for different devices) inside the control files (this
> way, the resetting of cgroup values for all the drm devices, can be
> done by a simple 'cp'.)
>
>>> Usage examples:
>>> // set limit for card1 to 1GB
>>> sed -i '2s/.*/1073741824/' /sys/fs/cgroup/<cgroup>/drm.buffer.total.max
>>>
>>> // set limit for card0 to 512MB
>>> sed -i '1s/.*/536870912/' /sys/fs/cgroup/<cgroup>/drm.buffer.total.max
>
>>>   /** @file drm_gem.c
>>> @@ -154,6 +156,9 @@ void drm_gem_private_object_init(struct drm_device *dev,
>>>        obj->handle_count = 0;
>>>        obj->size = size;
>>>        drm_vma_node_reset(&obj->vma_node);
>>> +
>>> +     obj->drmcgrp = get_drmcgrp(current);
>>> +     drmcgrp_chg_bo_alloc(obj->drmcgrp, dev, size);
>> Why do the charging here?
>> There is no backing store yet for the buffer, so this is really tracking something akin to allowed virtual memory for GEM objects?
>> Is this really useful for an administrator to control?
>> Isn't the resource we want to control actually the physical backing store?
> That's correct.  This is just the first level of control since the
> backing store can be backed by different type of memory.  I am in the
> process of adding at least two more resources.  Stay tuned.  I am
> doing the charge here to enforce the idea of "creator is deemed owner"
> at a place where the code is shared by all (the init function.)
>
>>> +     while (i <= max_minor && limits != NULL) {
>>> +             sval =  strsep(&limits, "\n");
>>> +             rc = kstrtoll(sval, 0, &val);
>> Input should be "KEY VALUE", so KEY will determine device to apply this to.
>> Also, per cgroups-v2 documentation of limits, I believe need to parse and handle the special "max" input value.
>>
>> parse_resources() in rdma controller is example for both of above.
> Please see my previous reply for the rationale of my hope to not need
> a key.  I can certainly add handling of "max" and "default".
>
>
>>> +void drmcgrp_chg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
>>> +             size_t size)
>> Shouldn't this return an error and be implemented with same semantics as the
>> try_charge() functions of other controllers?
>> Below will allow stats_total_allocated to overrun limits_total_allocated.
> This is because I am charging the buffer at the init of the buffer
> which does not fail so the "try" (drmcgrp_bo_can_allocate) is separate
> and placed earlier and nearer other condition where gem object
> allocation may fail.  In other words, there are multiple possibilities
> for which gem allocation may fail (cgroup limit being one of them) and
> satisfying cgroup limit does not mean a charge is needed.  I can
> certainly combine the two functions to have an additional try_charge
> semantic as well if that is really needed.
>
> Regards,
> Kenny

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC PATCH v2 4/5] drm, cgroup: Add total GEM buffer allocation limit
  2019-05-16  7:16             ` Koenig, Christian
@ 2019-05-16  7:25               ` Christian König
       [not found]                 ` <6e124f5e-f83f-5ca1-4616-92538f202653-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 80+ messages in thread
From: Christian König @ 2019-05-16  7:25 UTC (permalink / raw)
  To: Koenig, Christian, Kenny Ho, Welty, Brian
  Cc: sunnanyong, Ho, Kenny, dri-devel, Tejun Heo, amd-gfx, Deucher,
	Alexander, cgroups

Am 16.05.19 um 09:16 schrieb Koenig, Christian:
> Am 16.05.19 um 04:29 schrieb Kenny Ho:
>> On Wed, May 15, 2019 at 5:26 PM Welty, Brian <brian.welty@intel.com> wrote:
>>> On 5/9/2019 2:04 PM, Kenny Ho wrote:
>>>> There are four control file types,
>>>> stats (ro) - display current measured values for a resource
>>>> max (rw) - limits for a resource
>>>> default (ro, root cgroup only) - default values for a resource
>>>> help (ro, root cgroup only) - help string for a resource
>>>>
>>>> Each file is multi-lined with one entry/line per drm device.
>>> Multi-line is correct for multiple devices, but I believe you need
>>> to use a KEY to denote device for both your set and get routines.
>>> I didn't see your set functions reading a key, or the get functions
>>> printing the key in output.
>>> cgroups-v2 conventions mention using KEY of major:minor, but I think
>>> you can use drm_minor as key?
>> Given this controller is specific to the drm kernel subsystem which
>> uses minor to identify drm device,
> Wait a second, using the DRM minor is a good idea in the first place.

Well, that should have read "is not a good idea".

Christian.

>
> I have a test system with a Vega10 and a Vega20. Which device gets which
> minor is not stable, but rather defined by the scan order of the PCIe bus.
>
> Normally the scan order is always the same, but adding or removing
> devices or delaying things just a little bit during init is enough to
> change this.
>
> We need something like the Linux sysfs location or similar to have a
> stable implementation.
>
> Regards,
> Christian.
>
>>    I don't see a need to complicate
>> the interfaces more by having major and a key.  As you can see in the
>> examples below, the drm device minor corresponds to the line number.
>> I am not sure how strict cgroup upstream is about the convention but I
>> am hoping there are flexibility here to allow for what I have
>> implemented.  There are a couple of other things I have done that is
>> not described in the convention: 1) inclusion of read-only *.help file
>> at the root cgroup, 2) use read-only (which I can potentially make rw)
>> *.default file instead of having a default entries (since the default
>> can be different for different devices) inside the control files (this
>> way, the resetting of cgroup values for all the drm devices, can be
>> done by a simple 'cp'.)
>>
>>>> Usage examples:
>>>> // set limit for card1 to 1GB
>>>> sed -i '2s/.*/1073741824/' /sys/fs/cgroup/<cgroup>/drm.buffer.total.max
>>>>
>>>> // set limit for card0 to 512MB
>>>> sed -i '1s/.*/536870912/' /sys/fs/cgroup/<cgroup>/drm.buffer.total.max
>>>>    /** @file drm_gem.c
>>>> @@ -154,6 +156,9 @@ void drm_gem_private_object_init(struct drm_device *dev,
>>>>         obj->handle_count = 0;
>>>>         obj->size = size;
>>>>         drm_vma_node_reset(&obj->vma_node);
>>>> +
>>>> +     obj->drmcgrp = get_drmcgrp(current);
>>>> +     drmcgrp_chg_bo_alloc(obj->drmcgrp, dev, size);
>>> Why do the charging here?
>>> There is no backing store yet for the buffer, so this is really tracking something akin to allowed virtual memory for GEM objects?
>>> Is this really useful for an administrator to control?
>>> Isn't the resource we want to control actually the physical backing store?
>> That's correct.  This is just the first level of control since the
>> backing store can be backed by different type of memory.  I am in the
>> process of adding at least two more resources.  Stay tuned.  I am
>> doing the charge here to enforce the idea of "creator is deemed owner"
>> at a place where the code is shared by all (the init function.)
>>
>>>> +     while (i <= max_minor && limits != NULL) {
>>>> +             sval =  strsep(&limits, "\n");
>>>> +             rc = kstrtoll(sval, 0, &val);
>>> Input should be "KEY VALUE", so KEY will determine device to apply this to.
>>> Also, per cgroups-v2 documentation of limits, I believe need to parse and handle the special "max" input value.
>>>
>>> parse_resources() in rdma controller is example for both of above.
>> Please see my previous reply for the rationale of my hope to not need
>> a key.  I can certainly add handling of "max" and "default".
>>
>>
>>>> +void drmcgrp_chg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
>>>> +             size_t size)
>>> Shouldn't this return an error and be implemented with same semantics as the
>>> try_charge() functions of other controllers?
>>> Below will allow stats_total_allocated to overrun limits_total_allocated.
>> This is because I am charging the buffer at the init of the buffer
>> which does not fail so the "try" (drmcgrp_bo_can_allocate) is separate
>> and placed earlier and nearer other condition where gem object
>> allocation may fail.  In other words, there are multiple possibilities
>> for which gem allocation may fail (cgroup limit being one of them) and
>> satisfying cgroup limit does not mean a charge is needed.  I can
>> certainly combine the two functions to have an additional try_charge
>> semantic as well if that is really needed.
>>
>> Regards,
>> Kenny

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC PATCH v2 4/5] drm, cgroup: Add total GEM buffer allocation limit
       [not found]                 ` <6e124f5e-f83f-5ca1-4616-92538f202653-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2019-05-16 12:28                     ` Daniel Vetter
  2019-05-16 14:03                     ` Kenny Ho
  1 sibling, 0 replies; 80+ messages in thread
From: Daniel Vetter @ 2019-05-16 12:28 UTC (permalink / raw)
  To: christian.koenig-5C7GfCeVMHo
  Cc: sunnanyong-hv44wF8Li93QT0dZR+AlfA, Ho, Kenny, Welty, Brian,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Deucher, Alexander,
	Kenny Ho, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Tejun Heo,
	cgroups-u79uwXL29TY76Z2rM5mHXA

On Thu, May 16, 2019 at 09:25:31AM +0200, Christian König wrote:
> Am 16.05.19 um 09:16 schrieb Koenig, Christian:
> > Am 16.05.19 um 04:29 schrieb Kenny Ho:
> > > On Wed, May 15, 2019 at 5:26 PM Welty, Brian <brian.welty@intel.com> wrote:
> > > > On 5/9/2019 2:04 PM, Kenny Ho wrote:
> > > > > There are four control file types,
> > > > > stats (ro) - display current measured values for a resource
> > > > > max (rw) - limits for a resource
> > > > > default (ro, root cgroup only) - default values for a resource
> > > > > help (ro, root cgroup only) - help string for a resource
> > > > > 
> > > > > Each file is multi-lined with one entry/line per drm device.
> > > > Multi-line is correct for multiple devices, but I believe you need
> > > > to use a KEY to denote device for both your set and get routines.
> > > > I didn't see your set functions reading a key, or the get functions
> > > > printing the key in output.
> > > > cgroups-v2 conventions mention using KEY of major:minor, but I think
> > > > you can use drm_minor as key?
> > > Given this controller is specific to the drm kernel subsystem which
> > > uses minor to identify drm device,
> > Wait a second, using the DRM minor is a good idea in the first place.
> 
> Well that should have read "is not a good idea"..

What else should we use?
> 
> Christian.
> 
> > 
> > I have a test system with a Vega10 and a Vega20. Which device gets which
> > minor is not stable, but rather defined by the scan order of the PCIe bus.
> > 
> > Normally the scan order is always the same, but adding or removing
> > devices or delaying things just a little bit during init is enough to
> > change this.
> > 
> > We need something like the Linux sysfs location or similar to have a
> > stable implementation.

You can go from sysfs location to drm class directory (in sysfs) and back.
That means if you care you need to walk sysfs yourself a bit, but using
the drm minor isn't a blocker itself.

One downside with the drm minor is that it's pretty much nonsense once you
have more than 64 gpus though, due to how we space render and legacy nodes
in the minor ids :-)
-Daniel
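
To make the sysfs walk concrete, here is a small user-space sketch in C
(not part of this series) that maps a primary-node minor back to the
stable device location Christian is asking for, assuming the standard
/sys/class/drm layout:

#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: resolve /dev/dri/cardN (minor N) to the stable
 * device path by following the /sys/class/drm/cardN/device symlink. */
static int drm_minor_to_device_path(int minor, char *buf, size_t len)
{
	char link[64];
	ssize_t n;

	snprintf(link, sizeof(link), "/sys/class/drm/card%d/device", minor);
	n = readlink(link, buf, len - 1);
	if (n < 0)
		return -1;
	buf[n] = '\0';	/* typically ends in a PCI address, e.g. 0000:03:00.0 */
	return 0;
}

Going the other direction (from a sysfs device to its drm minor) is just
a matter of reading the drm class directory under that device, which is
the walk referred to above.
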
> > 
> > Regards,
> > Christian.
> > 
> > >    I don't see a need to complicate
> > > the interfaces more by having major and a key.  As you can see in the
> > > examples below, the drm device minor corresponds to the line number.
> > > I am not sure how strict cgroup upstream is about the convention but I
> > > am hoping there are flexibility here to allow for what I have
> > > implemented.  There are a couple of other things I have done that is
> > > not described in the convention: 1) inclusion of read-only *.help file
> > > at the root cgroup, 2) use read-only (which I can potentially make rw)
> > > *.default file instead of having a default entries (since the default
> > > can be different for different devices) inside the control files (this
> > > way, the resetting of cgroup values for all the drm devices, can be
> > > done by a simple 'cp'.)
> > > 
> > > > > Usage examples:
> > > > > // set limit for card1 to 1GB
> > > > > sed -i '2s/.*/1073741824/' /sys/fs/cgroup/<cgroup>/drm.buffer.total.max
> > > > > 
> > > > > // set limit for card0 to 512MB
> > > > > sed -i '1s/.*/536870912/' /sys/fs/cgroup/<cgroup>/drm.buffer.total.max
> > > > >    /** @file drm_gem.c
> > > > > @@ -154,6 +156,9 @@ void drm_gem_private_object_init(struct drm_device *dev,
> > > > >         obj->handle_count = 0;
> > > > >         obj->size = size;
> > > > >         drm_vma_node_reset(&obj->vma_node);
> > > > > +
> > > > > +     obj->drmcgrp = get_drmcgrp(current);
> > > > > +     drmcgrp_chg_bo_alloc(obj->drmcgrp, dev, size);
> > > > Why do the charging here?
> > > > There is no backing store yet for the buffer, so this is really tracking something akin to allowed virtual memory for GEM objects?
> > > > Is this really useful for an administrator to control?
> > > > Isn't the resource we want to control actually the physical backing store?
> > > That's correct.  This is just the first level of control since the
> > > backing store can be backed by different type of memory.  I am in the
> > > process of adding at least two more resources.  Stay tuned.  I am
> > > doing the charge here to enforce the idea of "creator is deemed owner"
> > > at a place where the code is shared by all (the init function.)
> > > 
> > > > > +     while (i <= max_minor && limits != NULL) {
> > > > > +             sval =  strsep(&limits, "\n");
> > > > > +             rc = kstrtoll(sval, 0, &val);
> > > > Input should be "KEY VALUE", so KEY will determine device to apply this to.
> > > > Also, per cgroups-v2 documentation of limits, I believe need to parse and handle the special "max" input value.
> > > > 
> > > > parse_resources() in rdma controller is example for both of above.
> > > Please see my previous reply for the rationale of my hope to not need
> > > a key.  I can certainly add handling of "max" and "default".
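
For reference, a rough sketch of what a "MAJOR:MINOR VALUE" parser with
"max" handling could look like, loosely modelled on the rdma controller's
parse_resources().  The helper name and the surrounding plumbing are
assumptions for illustration, not code from this series.

/* Hypothetical sketch: parse one "MAJOR:MINOR VALUE" line, where VALUE
 * may be the literal "max" (meaning no limit).  Validation of the major
 * number, device lookup and writeback of the limit are omitted. */
static int drmcgrp_parse_limit_line(char *line, unsigned int *minor, s64 *val)
{
	unsigned int maj, mnr;
	char *key, *v;

	key = strsep(&line, " ");
	if (!key || !line || sscanf(key, "%u:%u", &maj, &mnr) != 2)
		return -EINVAL;

	*minor = mnr;
	v = strim(line);

	if (!strcmp(v, "max")) {
		*val = S64_MAX;		/* "max" maps to unlimited */
		return 0;
	}
	return kstrtos64(v, 0, val);
}

The matching read side would then print the same "MAJOR:MINOR VALUE"
pairs, so that output written back is accepted unchanged, which is the
round trip the cgroup-v2 conventions expect.
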
> > > 
> > > 
> > > > > +void drmcgrp_chg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
> > > > > +             size_t size)
> > > > Shouldn't this return an error and be implemented with same semantics as the
> > > > try_charge() functions of other controllers?
> > > > Below will allow stats_total_allocated to overrun limits_total_allocated.
> > > This is because I am charging the buffer at the init of the buffer
> > > which does not fail so the "try" (drmcgrp_bo_can_allocate) is separate
> > > and placed earlier and nearer other condition where gem object
> > > allocation may fail.  In other words, there are multiple possibilities
> > > for which gem allocation may fail (cgroup limit being one of them) and
> > > satisfying cgroup limit does not mean a charge is needed.  I can
> > > certainly combine the two functions to have an additional try_charge
> > > semantic as well if that is really needed.
> > > 
> > > Regards,
> > > Kenny
> > _______________________________________________
> > amd-gfx mailing list
> > amd-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC PATCH v2 4/5] drm, cgroup: Add total GEM buffer allocation limit
       [not found]                 ` <6e124f5e-f83f-5ca1-4616-92538f202653-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2019-05-16 14:03                     ` Kenny Ho
  2019-05-16 14:03                     ` Kenny Ho
  1 sibling, 0 replies; 80+ messages in thread
From: Kenny Ho @ 2019-05-16 14:03 UTC (permalink / raw)
  To: Christian König
  Cc: sunnanyong-hv44wF8Li93QT0dZR+AlfA, Ho, Kenny, Welty, Brian,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Tejun Heo,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Deucher, Alexander,
	cgroups-u79uwXL29TY76Z2rM5mHXA

On Thu, May 16, 2019 at 3:25 AM Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
> Am 16.05.19 um 09:16 schrieb Koenig, Christian:
> > Am 16.05.19 um 04:29 schrieb Kenny Ho:
> >> On Wed, May 15, 2019 at 5:26 PM Welty, Brian <brian.welty@intel.com> wrote:
> >>> On 5/9/2019 2:04 PM, Kenny Ho wrote:
> >>>> Each file is multi-lined with one entry/line per drm device.
> >>> Multi-line is correct for multiple devices, but I believe you need
> >>> to use a KEY to denote device for both your set and get routines.
> >>> I didn't see your set functions reading a key, or the get functions
> >>> printing the key in output.
> >>> cgroups-v2 conventions mention using KEY of major:minor, but I think
> >>> you can use drm_minor as key?
> >> Given this controller is specific to the drm kernel subsystem which
> >> uses minor to identify drm device,
> > Wait a second, using the DRM minor is a good idea in the first place.
> Well that should have read "is not a good idea"..
>
> I have a test system with a Vega10 and a Vega20. Which device gets which
> minor is not stable, but rather defined by the scan order of the PCIe bus.
>
> Normally the scan order is always the same, but adding or removing
> devices or delaying things just a little bit during init is enough to
> change this.
>
> We need something like the Linux sysfs location or similar to have a
> stable implementation.

I get that, which is why I don't use minor to identify cards in user
space apps I wrote:
https://github.com/RadeonOpenCompute/k8s-device-plugin/blob/c2659c9d1d0713cad36fb5256681125121e6e32f/internal/pkg/amdgpu/amdgpu.go#L85

But within the kernel, I think my use of minor is consistent with the
rest of the drm subsystem.  I hope I don't need to reform the way the
drm subsystem uses minors in order to introduce a cgroup controller.

Regards,
Kenny
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC PATCH v2 4/5] drm, cgroup: Add total GEM buffer allocation limit
  2019-05-16 12:28                     ` Daniel Vetter
@ 2019-05-16 14:08                       ` Koenig, Christian
  -1 siblings, 0 replies; 80+ messages in thread
From: Koenig, Christian @ 2019-05-16 14:08 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: sunnanyong, Ho, Kenny, Welty, Brian, dri-devel, Deucher,
	Alexander, Kenny Ho, amd-gfx, Tejun Heo, cgroups

Am 16.05.19 um 14:28 schrieb Daniel Vetter:
> [CAUTION: External Email]
>
> On Thu, May 16, 2019 at 09:25:31AM +0200, Christian König wrote:
>> Am 16.05.19 um 09:16 schrieb Koenig, Christian:
>>> Am 16.05.19 um 04:29 schrieb Kenny Ho:
>>>> [CAUTION: External Email]
>>>>
>>>> On Wed, May 15, 2019 at 5:26 PM Welty, Brian <brian.welty@intel.com> wrote:
>>>>> On 5/9/2019 2:04 PM, Kenny Ho wrote:
>>>>>> There are four control file types,
>>>>>> stats (ro) - display current measured values for a resource
>>>>>> max (rw) - limits for a resource
>>>>>> default (ro, root cgroup only) - default values for a resource
>>>>>> help (ro, root cgroup only) - help string for a resource
>>>>>>
>>>>>> Each file is multi-lined with one entry/line per drm device.
>>>>> Multi-line is correct for multiple devices, but I believe you need
>>>>> to use a KEY to denote device for both your set and get routines.
>>>>> I didn't see your set functions reading a key, or the get functions
>>>>> printing the key in output.
>>>>> cgroups-v2 conventions mention using KEY of major:minor, but I think
>>>>> you can use drm_minor as key?
>>>> Given this controller is specific to the drm kernel subsystem which
>>>> uses minor to identify drm device,
>>> Wait a second, using the DRM minor is a good idea in the first place.
>> Well that should have read "is not a good idea"..
> What else should we use?

Well, what does udev use to identify a device, for example?

>> Christian.
>>
>>> I have a test system with a Vega10 and a Vega20. Which device gets which
>>> minor is not stable, but rather defined by the scan order of the PCIe bus.
>>>
>>> Normally the scan order is always the same, but adding or removing
>>> devices or delaying things just a little bit during init is enough to
>>> change this.
>>>
>>> We need something like the Linux sysfs location or similar to have a
>>> stable implementation.
> You can go from sysfs location to drm class directory (in sysfs) and back.
> That means if you care you need to walk sysfs yourself a bit, but using
> the drm minor isn't a blocker itself.

Yeah, agreed that userspace could do this. But I think if there is an
off-hand alternative we should use that instead.

> One downside with the drm minor is that it's pretty much nonsense once you
> have more than 64 gpus though, due to how we space render and legacy nodes
> in the minor ids :-)

Ok, another good reason to at least not use the minor=linenum approach.

Christian.

> -Daniel
>>> Regards,
>>> Christian.
>>>
>>>>     I don't see a need to complicate
>>>> the interfaces more by having major and a key.  As you can see in the
>>>> examples below, the drm device minor corresponds to the line number.
>>>> I am not sure how strict cgroup upstream is about the convention but I
>>>> am hoping there are flexibility here to allow for what I have
>>>> implemented.  There are a couple of other things I have done that is
>>>> not described in the convention: 1) inclusion of read-only *.help file
>>>> at the root cgroup, 2) use read-only (which I can potentially make rw)
>>>> *.default file instead of having a default entries (since the default
>>>> can be different for different devices) inside the control files (this
>>>> way, the resetting of cgroup values for all the drm devices, can be
>>>> done by a simple 'cp'.)
>>>>
>>>>>> Usage examples:
>>>>>> // set limit for card1 to 1GB
>>>>>> sed -i '2s/.*/1073741824/' /sys/fs/cgroup/<cgroup>/drm.buffer.total.max
>>>>>>
>>>>>> // set limit for card0 to 512MB
>>>>>> sed -i '1s/.*/536870912/' /sys/fs/cgroup/<cgroup>/drm.buffer.total.max
>>>>>>     /** @file drm_gem.c
>>>>>> @@ -154,6 +156,9 @@ void drm_gem_private_object_init(struct drm_device *dev,
>>>>>>          obj->handle_count = 0;
>>>>>>          obj->size = size;
>>>>>>          drm_vma_node_reset(&obj->vma_node);
>>>>>> +
>>>>>> +     obj->drmcgrp = get_drmcgrp(current);
>>>>>> +     drmcgrp_chg_bo_alloc(obj->drmcgrp, dev, size);
>>>>> Why do the charging here?
>>>>> There is no backing store yet for the buffer, so this is really tracking something akin to allowed virtual memory for GEM objects?
>>>>> Is this really useful for an administrator to control?
>>>>> Isn't the resource we want to control actually the physical backing store?
>>>> That's correct.  This is just the first level of control since the
>>>> backing store can be backed by different type of memory.  I am in the
>>>> process of adding at least two more resources.  Stay tuned.  I am
>>>> doing the charge here to enforce the idea of "creator is deemed owner"
>>>> at a place where the code is shared by all (the init function.)
>>>>
>>>>>> +     while (i <= max_minor && limits != NULL) {
>>>>>> +             sval =  strsep(&limits, "\n");
>>>>>> +             rc = kstrtoll(sval, 0, &val);
>>>>> Input should be "KEY VALUE", so KEY will determine device to apply this to.
>>>>> Also, per cgroups-v2 documentation of limits, I believe need to parse and handle the special "max" input value.
>>>>>
>>>>> parse_resources() in rdma controller is example for both of above.
>>>> Please see my previous reply for the rationale of my hope to not need
>>>> a key.  I can certainly add handling of "max" and "default".
>>>>
>>>>
>>>>>> +void drmcgrp_chg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
>>>>>> +             size_t size)
>>>>> Shouldn't this return an error and be implemented with same semantics as the
>>>>> try_charge() functions of other controllers?
>>>>> Below will allow stats_total_allocated to overrun limits_total_allocated.
>>>> This is because I am charging the buffer at the init of the buffer
>>>> which does not fail so the "try" (drmcgrp_bo_can_allocate) is separate
>>>> and placed earlier and nearer other condition where gem object
>>>> allocation may fail.  In other words, there are multiple possibilities
>>>> for which gem allocation may fail (cgroup limit being one of them) and
>>>> satisfying cgroup limit does not mean a charge is needed.  I can
>>>> certainly combine the two functions to have an additional try_charge
>>>> semantic as well if that is really needed.
>>>>
>>>> Regards,
>>>> Kenny
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC PATCH v2 4/5] drm, cgroup: Add total GEM buffer allocation limit
       [not found]           ` <CAOWid-ftUrVVWPu9KuS8xpWKNQT6_FtxB8gEyEAn9nLD6qxb5Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2019-05-16  7:16             ` Koenig, Christian
@ 2019-05-16 14:10             ` Tejun Heo
  2019-05-16 14:58               ` Kenny Ho
  1 sibling, 1 reply; 80+ messages in thread
From: Tejun Heo @ 2019-05-16 14:10 UTC (permalink / raw)
  To: Kenny Ho
  Cc: sunnanyong-hv44wF8Li93QT0dZR+AlfA, Kenny Ho, Welty, Brian,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Alex Deucher,
	cgroups-u79uwXL29TY76Z2rM5mHXA, Christian König

Hello,

I haven't gone through the patchset yet but some quick comments.

On Wed, May 15, 2019 at 10:29:21PM -0400, Kenny Ho wrote:
> Given this controller is specific to the drm kernel subsystem which
> uses minor to identify drm device, I don't see a need to complicate
> the interfaces more by having major and a key.  As you can see in the
> examples below, the drm device minor corresponds to the line number.
> I am not sure how strict cgroup upstream is about the convention but I

We're pretty strict.

> am hoping there are flexibility here to allow for what I have
> implemented.  There are a couple of other things I have done that is

So, please follow the interface conventions.  We can definitely add
new ones but that would need functional reasons.

> not described in the convention: 1) inclusion of read-only *.help file
> at the root cgroup, 2) use read-only (which I can potentially make rw)
> *.default file instead of having a default entries (since the default
> can be different for different devices) inside the control files (this
> way, the resetting of cgroup values for all the drm devices, can be
> done by a simple 'cp'.)

Again, please follow the existing conventions.  There's a lot more
harm than good in every controller being creative in their own way.
It's trivial to build convenience features in userspace.  Please do it
there.

> > Is this really useful for an administrator to control?
> > Isn't the resource we want to control actually the physical backing store?
> That's correct.  This is just the first level of control since the
> backing store can be backed by different type of memory.  I am in the
> process of adding at least two more resources.  Stay tuned.  I am
> doing the charge here to enforce the idea of "creator is deemed owner"
> at a place where the code is shared by all (the init function.)

Ideally, a controller should only control hard resources which impact
behaviors and performance which are immediately visible to users.

Thanks.

-- 
tejun
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC PATCH v2 4/5] drm, cgroup: Add total GEM buffer allocation limit
       [not found]                     ` <CAOWid-fQgah16ycz-V-ymsm7yKUnFTeTSBaW4MK=2mqUHhCcmw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2019-05-16 14:12                         ` Christian König
  0 siblings, 0 replies; 80+ messages in thread
From: Christian König @ 2019-05-16 14:12 UTC (permalink / raw)
  To: Kenny Ho, Christian König
  Cc: sunnanyong-hv44wF8Li93QT0dZR+AlfA, Ho, Kenny, Welty, Brian,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Deucher, Alexander,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Tejun Heo,
	cgroups-u79uwXL29TY76Z2rM5mHXA

Am 16.05.19 um 16:03 schrieb Kenny Ho:
> On Thu, May 16, 2019 at 3:25 AM Christian König
> <ckoenig.leichtzumerken@gmail.com> wrote:
>> Am 16.05.19 um 09:16 schrieb Koenig, Christian:
>>> Am 16.05.19 um 04:29 schrieb Kenny Ho:
>>>> On Wed, May 15, 2019 at 5:26 PM Welty, Brian <brian.welty@intel.com> wrote:
>>>>> On 5/9/2019 2:04 PM, Kenny Ho wrote:
>>>>>> Each file is multi-lined with one entry/line per drm device.
>>>>> Multi-line is correct for multiple devices, but I believe you need
>>>>> to use a KEY to denote device for both your set and get routines.
>>>>> I didn't see your set functions reading a key, or the get functions
>>>>> printing the key in output.
>>>>> cgroups-v2 conventions mention using KEY of major:minor, but I think
>>>>> you can use drm_minor as key?
>>>> Given this controller is specific to the drm kernel subsystem which
>>>> uses minor to identify drm device,
>>> Wait a second, using the DRM minor is a good idea in the first place.
>> Well that should have read "is not a good idea"..
>>
>> I have a test system with a Vega10 and a Vega20. Which device gets which
>> minor is not stable, but rather defined by the scan order of the PCIe bus.
>>
>> Normally the scan order is always the same, but adding or removing
>> devices or delaying things just a little bit during init is enough to
>> change this.
>>
>> We need something like the Linux sysfs location or similar to have a
>> stable implementation.
> I get that, which is why I don't use minor to identify cards in user
> space apps I wrote:
> https://github.com/RadeonOpenCompute/k8s-device-plugin/blob/c2659c9d1d0713cad36fb5256681125121e6e32f/internal/pkg/amdgpu/amdgpu.go#L85

Yeah, that is certainly a possibility.

> But within the kernel, I think my use of minor is consistent with the
> rest of the drm subsystem.  I hope I don't need to reform the way the
> drm subsystem use minor in order to introduce a cgroup controller.

Well I would try to avoid using the minor and at least look for
alternatives. E.g. what does udev use to identify the devices, for
example? And IIRC we have something like a "device-name" in the kernel
as well (what's printed in the logs).

The minimum we need to do is get away from the minor=linenum approach,
because as Daniel pointed out the minor allocation is quite a mess and not
necessarily contiguous.

Regards,
Christian.

>
> Regards,
> Kenny
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC PATCH v2 4/5] drm, cgroup: Add total GEM buffer allocation limit
  2019-05-16 14:12                         ` Christian König
  (?)
@ 2019-05-16 14:28                         ` Kenny Ho
  -1 siblings, 0 replies; 80+ messages in thread
From: Kenny Ho @ 2019-05-16 14:28 UTC (permalink / raw)
  To: Christian König
  Cc: sunnanyong, Ho, Kenny, Welty, Brian, amd-gfx, Deucher, Alexander,
	dri-devel, Tejun Heo, cgroups

On Thu, May 16, 2019 at 10:12 AM Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
> Am 16.05.19 um 16:03 schrieb Kenny Ho:
> > On Thu, May 16, 2019 at 3:25 AM Christian König
> > <ckoenig.leichtzumerken@gmail.com> wrote:
> >> Am 16.05.19 um 09:16 schrieb Koenig, Christian:
> >> We need something like the Linux sysfs location or similar to have a
> >> stable implementation.
> > I get that, which is why I don't use minor to identify cards in user
> > space apps I wrote:
> > https://github.com/RadeonOpenCompute/k8s-device-plugin/blob/c2659c9d1d0713cad36fb5256681125121e6e32f/internal/pkg/amdgpu/amdgpu.go#L85
>
> Yeah, that is certainly a possibility.
>
> > But within the kernel, I think my use of minor is consistent with the
> > rest of the drm subsystem.  I hope I don't need to reform the way the
> > drm subsystem uses minors in order to introduce a cgroup controller.
>
> Well I would try to avoid using the minor and at least look for
> alternatives. E.g. what does udev use to identify the devices, for
> example? And IIRC we have something like a "device-name" in the kernel
> as well (what's printed in the logs).
>
> The minimum we need to do is get away from the minor=linenum approach,
> because as Daniel pointed out the minor allocation is quite a mess and not
> necessarily contiguous.

I noticed :) but it looks like there isn't much of a choice, given what
Tejun replied about the cgroup conventions.

Regards,
Kenny
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC PATCH v2 4/5] drm, cgroup: Add total GEM buffer allocation limit
  2019-05-16 14:10             ` Tejun Heo
@ 2019-05-16 14:58               ` Kenny Ho
  0 siblings, 0 replies; 80+ messages in thread
From: Kenny Ho @ 2019-05-16 14:58 UTC (permalink / raw)
  To: Tejun Heo
  Cc: sunnanyong, Kenny Ho, Welty, Brian, amd-gfx, dri-devel,
	Alex Deucher, cgroups, Christian König

On Thu, May 16, 2019 at 10:10 AM Tejun Heo <tj@kernel.org> wrote:
> I haven't gone through the patchset yet but some quick comments.
>
> On Wed, May 15, 2019 at 10:29:21PM -0400, Kenny Ho wrote:
> > Given this controller is specific to the drm kernel subsystem which
> > uses minor to identify drm device, I don't see a need to complicate
> > the interfaces more by having major and a key.  As you can see in the
> > examples below, the drm device minor corresponds to the line number.
> > I am not sure how strict cgroup upstream is about the convention but I
>
> We're pretty strict.
>
> > am hoping there are flexibility here to allow for what I have
> > implemented.  There are a couple of other things I have done that is
>
> So, please follow the interface conventions.  We can definitely add
> new ones but that would need functional reasons.
>
> > not described in the convention: 1) inclusion of read-only *.help file
> > at the root cgroup, 2) use read-only (which I can potentially make rw)
> > *.default file instead of having a default entries (since the default
> > can be different for different devices) inside the control files (this
> > way, the resetting of cgroup values for all the drm devices, can be
> > done by a simple 'cp'.)
>
> Again, please follow the existing conventions.  There's a lot more
> harm than good in every controller being creative in their own way.
> It's trivial to build convenience features in userspace.  Please do it
> there.
I can certainly remove the ro *.help file and leave the documentation
to Documentation/, but for the *.default I do have a functional reason
for it.  As far as I can tell from the convention, the default is per
cgroup and there is no way to describe a per-device default.  Although,
perhaps we are talking about two different kinds of defaults.  Anyway,
I can leave the discussion to a more detailed review.

Regards,
Kenny
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 80+ messages in thread

end of thread, other threads:[~2019-05-16 14:58 UTC | newest]

Thread overview: 80+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-20 18:58 [PATCH RFC 0/5] DRM cgroup controller Kenny Ho
2018-11-20 18:58 ` [PATCH RFC 1/5] cgroup: Introduce cgroup for drm subsystem Kenny Ho
     [not found] ` <20181120185814.13362-1-Kenny.Ho-5C7GfCeVMHo@public.gmane.org>
2018-11-20 18:58   ` [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices Kenny Ho
     [not found]     ` <20181120185814.13362-3-Kenny.Ho-5C7GfCeVMHo@public.gmane.org>
2018-11-20 20:21       ` Tejun Heo
     [not found]         ` <20181120202141.GA2509588-LpCCV3molIbIZ9tKgghJQw2O0Ztt9esIQQ4Iyu8u01E@public.gmane.org>
2018-11-20 22:21           ` Ho, Kenny
     [not found]             ` <DM5PR12MB1226E972538A45325114ADF683D90-2J9CzHegvk+lTFawYev2gQdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2018-11-20 22:30               ` Tejun Heo
     [not found]                 ` <20181120223018.GB2509588-LpCCV3molIbIZ9tKgghJQw2O0Ztt9esIQQ4Iyu8u01E@public.gmane.org>
2018-11-21 22:07                   ` Ho, Kenny
2018-11-21 22:12                   ` Ho, Kenny
2018-11-26 20:59                   ` Kasiviswanathan, Harish
2018-11-27  9:38                     ` Koenig, Christian
2018-11-27  9:46                     ` [Intel-gfx] " Joonas Lahtinen
2018-11-27 15:41                       ` Ho, Kenny
2018-11-28  9:14                         ` Joonas Lahtinen
2018-11-28 19:46                           ` Ho, Kenny
2018-11-30 22:22                             ` Matt Roper
     [not found]                               ` <20181130222228.GE31345-b/RNqDZ/lqH1fpGqjiHozbKMmGWinSIL2HeeBUIffwg@public.gmane.org>
2018-12-03  6:46                                 ` [Intel-gfx] " Ho, Kenny
2018-12-03 18:58                                   ` Matt Roper
     [not found]                           ` <154339645444.5339.6291298808444340104-zzJjBcU1GAT9BXuAQUXR0fooFf0ArEBIu+b9c/7xato@public.gmane.org>
2018-12-03 20:55                             ` Kuehling, Felix
2018-12-03 20:55                               ` Kuehling, Felix
     [not found]                               ` <219f8754-3e14-05ad-07a3-6cddb8bb74aa-5C7GfCeVMHo@public.gmane.org>
2018-12-05 14:20                                 ` Joonas Lahtinen
2018-12-05 14:20                                   ` Joonas Lahtinen
2018-11-21  9:53       ` Christian König
2018-11-20 18:58   ` [PATCH RFC 3/5] drm/amdgpu: Add DRM cgroup support for AMD devices Kenny Ho
     [not found]     ` <20181120185814.13362-4-Kenny.Ho-5C7GfCeVMHo@public.gmane.org>
2018-11-21  9:55       ` Christian König
2018-11-20 18:58   ` [PATCH RFC 4/5] drm/amdgpu: Add accounting of command submission via DRM cgroup Kenny Ho
2018-11-20 20:57     ` Eric Anholt
2018-11-20 20:57       ` Eric Anholt
     [not found]       ` <87r2ff79he.fsf-WhKQ6XTQaPysTnJN9+BGXg@public.gmane.org>
2018-11-21 10:03         ` Christian König
2018-11-23 17:36           ` Eric Anholt
     [not found]             ` <871s7b7l2b.fsf-WhKQ6XTQaPysTnJN9+BGXg@public.gmane.org>
2018-11-23 18:13               ` Koenig, Christian
2018-11-23 18:13                 ` Koenig, Christian
     [not found]                 ` <095e010c-e3b8-ec79-c87b-a05ce1d95e10-5C7GfCeVMHo@public.gmane.org>
2018-11-23 19:09                   ` Ho, Kenny
2018-11-23 19:09                     ` Ho, Kenny
2018-11-21  9:58     ` Christian König
2018-11-20 18:58 ` [PATCH RFC 5/5] drm/amdgpu: Add accounting of buffer object creation request " Kenny Ho
     [not found]   ` <20181120185814.13362-6-Kenny.Ho-5C7GfCeVMHo@public.gmane.org>
2018-11-20 20:56     ` Eric Anholt
2018-11-20 20:56       ` Eric Anholt
2018-11-21 10:00     ` Christian König
2018-11-27 18:15       ` Kenny Ho
     [not found]         ` <CAOWid-fMFUvT_XQijRd34+cUOxM=zbbf+HwWv_NbqO-rBo2d_A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2018-11-27 20:31           ` Christian König
     [not found]             ` <3299d9d6-e272-0459-8f63-0c81d11cde1e-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2018-11-27 20:36               ` Kenny Ho
2018-11-21  1:43 ` ✗ Fi.CI.BAT: failure for DRM cgroup controller Patchwork
2019-05-09 21:04 ` [RFC PATCH v2 0/5] new cgroup controller for gpu/drm subsystem Kenny Ho
     [not found]   ` <20190509210410.5471-1-Kenny.Ho-5C7GfCeVMHo@public.gmane.org>
2019-05-09 21:04     ` [RFC PATCH v2 1/5] cgroup: Introduce cgroup for drm subsystem Kenny Ho
2019-05-09 21:04     ` [RFC PATCH v2 2/5] cgroup: Add mechanism to register DRM devices Kenny Ho
2019-05-09 21:04     ` [RFC PATCH v2 5/5] drm, cgroup: Add peak GEM buffer allocation limit Kenny Ho
     [not found]       ` <20190509210410.5471-6-Kenny.Ho-5C7GfCeVMHo@public.gmane.org>
2019-05-10 12:29         ` Christian König
2019-05-10 12:31     ` [RFC PATCH v2 0/5] new cgroup controller for gpu/drm subsystem Christian König
2019-05-10 15:07       ` Kenny Ho
2019-05-10 15:07         ` Kenny Ho
     [not found]         ` <CAOWid-dJZrnAifFYByh4p9x-jA1o_5YWkoNVAVbdRUaxzdPbGA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2019-05-10 17:46           ` Koenig, Christian
2019-05-10 17:46             ` Koenig, Christian
2019-05-09 21:04   ` [RFC PATCH v2 3/5] drm/amdgpu: Register AMD devices for DRM cgroup Kenny Ho
2019-05-09 21:04   ` [RFC PATCH v2 4/5] drm, cgroup: Add total GEM buffer allocation limit Kenny Ho
     [not found]     ` <20190509210410.5471-5-Kenny.Ho-5C7GfCeVMHo@public.gmane.org>
2019-05-10 12:28       ` Christian König
     [not found]         ` <f63c8d6b-92a4-2977-d062-7e0b7036834e-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2019-05-10 14:57           ` Kenny Ho
2019-05-10 14:57             ` Kenny Ho
2019-05-10 15:08             ` Koenig, Christian
2019-05-10 15:08               ` Koenig, Christian
     [not found]               ` <1ca1363e-b39c-c299-1d24-098b1059f7ff-5C7GfCeVMHo@public.gmane.org>
2019-05-10 15:25                 ` Kenny Ho
2019-05-10 15:25                   ` Kenny Ho
2019-05-10 17:48                   ` Koenig, Christian
2019-05-10 17:48                     ` Koenig, Christian
2019-05-10 18:50                     ` Kenny Ho
     [not found]                       ` <CAOWid-es+C_iStQUkM52mO3TeP8eS9MX+emZDQNH2PyZCf=RHQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2019-05-13 15:10                         ` Daniel Vetter
2019-05-15 21:26     ` Welty, Brian
     [not found]       ` <d81e8f55-9602-818e-0f9c-1d9d150133b1-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2019-05-16  2:29         ` Kenny Ho
     [not found]           ` <CAOWid-ftUrVVWPu9KuS8xpWKNQT6_FtxB8gEyEAn9nLD6qxb5Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2019-05-16  7:16             ` Koenig, Christian
2019-05-16  7:25               ` Christian König
     [not found]                 ` <6e124f5e-f83f-5ca1-4616-92538f202653-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2019-05-16 12:28                   ` Daniel Vetter
2019-05-16 12:28                     ` Daniel Vetter
2019-05-16 14:08                     ` Koenig, Christian
2019-05-16 14:08                       ` Koenig, Christian
2019-05-16 14:03                   ` Kenny Ho
2019-05-16 14:03                     ` Kenny Ho
     [not found]                     ` <CAOWid-fQgah16ycz-V-ymsm7yKUnFTeTSBaW4MK=2mqUHhCcmw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2019-05-16 14:12                       ` Christian König
2019-05-16 14:12                         ` Christian König
2019-05-16 14:28                         ` Kenny Ho
2019-05-16 14:10             ` Tejun Heo
2019-05-16 14:58               ` Kenny Ho
