All of lore.kernel.org
 help / color / mirror / Atom feed
* [Patch v3 0/2] cgroup: KVM: New Encryption IDs cgroup controller
@ 2020-12-09 20:54 Vipin Sharma
  2020-12-09 20:54 ` [Patch v3 1/2] cgroup: svm: Add Encryption ID controller Vipin Sharma
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Vipin Sharma @ 2020-12-09 20:54 UTC (permalink / raw)
  To: thomas.lendacky, brijesh.singh, jon.grimm, eric.vantassell,
	pbonzini, seanjc, tj, lizefan, hannes, frankja, borntraeger,
	corbet
  Cc: joro, vkuznets, wanpengli, jmattson, tglx, mingo, bp, hpa,
	gingell, rientjes, dionnaglaze, kvm, x86, cgroups, linux-doc,
	linux-kernel, Vipin Sharma

Hello,

This patch adds a new cgroup controller, Encryption IDs, to track and
limit the usage of encryption IDs on a host.

AMD provides Secure Encrypted Virtualization (SEV) and SEV with
Encrypted State (SEV-ES) to encrypt the guest OS's memory using limited
number of Address Space Identifiers (ASIDs).

This limited number of ASIDs creates issues like SEV ASID starvation and
unoptimized scheduling in the cloud infrastucture.

In the RFC patch v1, I provided only SEV cgroup controller but based
on the feedback and discussion it became clear that this cgroup
controller can be extended to be used by Intel's Trusted Domain
Extension (TDX) and s390's protected virtualization Secure Execution IDs
(SEID)

This patch series provides a generic Encryption IDs controller with
tracking support of the SEV ASIDs.

Changes in v3:
- Fixes a build error when CONFIG_CGROUP is disabled.

Changes in v2:
- Changed cgroup name from sev to encryption_ids.
- Replaced SEV specific names in APIs and documentations with generic
  encryption IDs.
- Providing 3 cgroup files per encryption ID type. For example in SEV,
  - encryption_ids.sev.stat (only in the root cgroup directory).
  - encryption_ids.sev.max
  - encryption_ids.sev.current

Thanks
Vipin Sharma

[1] https://lore.kernel.org/lkml/20200922004024.3699923-1-vipinsh@google.com/#r
[2] https://lore.kernel.org/lkml/20201208213531.2626955-1-vipinsh@google.com/

 .../admin-guide/cgroup-v1/encryption_ids.rst  | 108 +++++
 Documentation/admin-guide/cgroup-v2.rst       |  78 +++-
 arch/x86/kvm/svm/sev.c                        |  28 +-
 include/linux/cgroup_subsys.h                 |   4 +
 include/linux/encryption_ids_cgroup.h         |  71 +++
 include/linux/kvm_host.h                      |   4 +
 init/Kconfig                                  |  14 +
 kernel/cgroup/Makefile                        |   1 +
 kernel/cgroup/encryption_ids.c                | 430 ++++++++++++++++++
 9 files changed, 729 insertions(+), 9 deletions(-)
 create mode 100644 Documentation/admin-guide/cgroup-v1/encryption_ids.rst
 create mode 100644 include/linux/encryption_ids_cgroup.h
 create mode 100644 kernel/cgroup/encryption_ids.c

-- 
2.29.2.576.ga3fc446d84-goog


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Patch v3 1/2] cgroup: svm: Add Encryption ID controller
  2020-12-09 20:54 [Patch v3 0/2] cgroup: KVM: New Encryption IDs cgroup controller Vipin Sharma
@ 2020-12-09 20:54 ` Vipin Sharma
  2020-12-09 20:54 ` [Patch v3 2/2] cgroup: svm: Encryption IDs cgroup documentation Vipin Sharma
  2020-12-09 20:58   ` Tejun Heo
  2 siblings, 0 replies; 13+ messages in thread
From: Vipin Sharma @ 2020-12-09 20:54 UTC (permalink / raw)
  To: thomas.lendacky, brijesh.singh, jon.grimm, eric.vantassell,
	pbonzini, seanjc, tj, lizefan, hannes, frankja, borntraeger,
	corbet
  Cc: joro, vkuznets, wanpengli, jmattson, tglx, mingo, bp, hpa,
	gingell, rientjes, dionnaglaze, kvm, x86, cgroups, linux-doc,
	linux-kernel, Vipin Sharma, kernel test robot

Hardware memory encryption is available on multiple generic CPUs. For
example AMD has Secure Encrypted Virtualization (SEV) and SEV -
Encrypted State (SEV-ES).

These memory encryptions are useful in creating encrypted virtual
machines (VMs) and user space programs.

There are limited number of encryption IDs that can be used
simultaneously on a machine for encryption. This generates a need for
the system admin to track, limit, allocate resources, and optimally
schedule VMs and user workloads in the cloud infrastructure. Some
malicious programs can exhaust all of these resources on a host causing
starvation of other workloads.

Encryption ID controller allows control of these resources using
Cgroups.

Controller is enabled by CGROUP_ENCRYPTION_IDS config option.
Encryption controller provide 3 interface files for each encryption ID
type. For example, in SEV:

1. encryption_ids.sev.stat
	Shown only at the root cgroup. Displays total SEV IDs available
	on the platform and current usage count.
2. encrpytion_ids.sev.max
	Sets the maximum usage of SEV IDs in the cgroup.
3. encryption_ids.sev.current
	Current usage of SEV IDs in the cgroup and its children.

Other ID types can be easily added in the controller in the same way.

Signed-off-by: Vipin Sharma <vipinsh@google.com>
Reviewed-by: David Rientjes <rientjes@google.com>
Reviewed-by: Dionna Glaze <dionnaglaze@google.com>
Reported-by: kernel test robot <lkp@intel.com>
---
 arch/x86/kvm/svm/sev.c                |  28 +-
 include/linux/cgroup_subsys.h         |   4 +
 include/linux/encryption_ids_cgroup.h |  71 +++++
 include/linux/kvm_host.h              |   4 +
 init/Kconfig                          |  14 +
 kernel/cgroup/Makefile                |   1 +
 kernel/cgroup/encryption_ids.c        | 430 ++++++++++++++++++++++++++
 7 files changed, 545 insertions(+), 7 deletions(-)
 create mode 100644 include/linux/encryption_ids_cgroup.h
 create mode 100644 kernel/cgroup/encryption_ids.c

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 566f4d18185b..83c23d1d568a 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -14,6 +14,7 @@
 #include <linux/psp-sev.h>
 #include <linux/pagemap.h>
 #include <linux/swap.h>
+#include <linux/encryption_ids_cgroup.h>
 
 #include "x86.h"
 #include "svm.h"
@@ -77,10 +78,14 @@ static bool __sev_recycle_asids(void)
 	return true;
 }
 
-static int sev_asid_new(void)
+static int sev_asid_new(struct kvm *kvm)
 {
 	bool retry = true;
-	int pos;
+	int pos, ret;
+
+	ret = enc_id_cg_try_charge(kvm, ENCRYPTION_ID_SEV, 1);
+	if (ret)
+		return ret;
 
 	mutex_lock(&sev_bitmap_lock);
 
@@ -95,7 +100,8 @@ static int sev_asid_new(void)
 			goto again;
 		}
 		mutex_unlock(&sev_bitmap_lock);
-		return -EBUSY;
+		ret = -EBUSY;
+		goto e_uncharge;
 	}
 
 	__set_bit(pos, sev_asid_bitmap);
@@ -103,6 +109,9 @@ static int sev_asid_new(void)
 	mutex_unlock(&sev_bitmap_lock);
 
 	return pos + 1;
+e_uncharge:
+	enc_id_cg_uncharge(kvm, ENCRYPTION_ID_SEV, 1);
+	return ret;
 }
 
 static int sev_get_asid(struct kvm *kvm)
@@ -112,7 +121,7 @@ static int sev_get_asid(struct kvm *kvm)
 	return sev->asid;
 }
 
-static void sev_asid_free(int asid)
+static void sev_asid_free(struct kvm *kvm, int asid)
 {
 	struct svm_cpu_data *sd;
 	int cpu, pos;
@@ -128,6 +137,7 @@ static void sev_asid_free(int asid)
 	}
 
 	mutex_unlock(&sev_bitmap_lock);
+	enc_id_cg_uncharge(kvm, ENCRYPTION_ID_SEV, 1);
 }
 
 static void sev_unbind_asid(struct kvm *kvm, unsigned int handle)
@@ -172,7 +182,7 @@ static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	if (unlikely(sev->active))
 		return ret;
 
-	asid = sev_asid_new();
+	asid = sev_asid_new(kvm);
 	if (asid < 0)
 		return ret;
 
@@ -187,7 +197,7 @@ static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	return 0;
 
 e_free:
-	sev_asid_free(asid);
+	sev_asid_free(kvm, asid);
 	return ret;
 }
 
@@ -1122,7 +1132,7 @@ void sev_vm_destroy(struct kvm *kvm)
 	mutex_unlock(&kvm->lock);
 
 	sev_unbind_asid(kvm, sev->handle);
-	sev_asid_free(sev->asid);
+	sev_asid_free(kvm, sev->asid);
 }
 
 int __init sev_hardware_setup(void)
@@ -1148,6 +1158,9 @@ int __init sev_hardware_setup(void)
 	if (!sev_reclaim_asid_bitmap)
 		return 1;
 
+	if (enc_id_cg_set_capacity(ENCRYPTION_ID_SEV, max_sev_asid))
+		return 1;
+
 	status = kmalloc(sizeof(*status), GFP_KERNEL);
 	if (!status)
 		return 1;
@@ -1177,6 +1190,7 @@ void sev_hardware_teardown(void)
 
 	bitmap_free(sev_asid_bitmap);
 	bitmap_free(sev_reclaim_asid_bitmap);
+	enc_id_cg_set_capacity(ENCRYPTION_ID_SEV, 0);
 
 	sev_flush_asids();
 }
diff --git a/include/linux/cgroup_subsys.h b/include/linux/cgroup_subsys.h
index acb77dcff3b4..83754f58c05e 100644
--- a/include/linux/cgroup_subsys.h
+++ b/include/linux/cgroup_subsys.h
@@ -61,6 +61,10 @@ SUBSYS(pids)
 SUBSYS(rdma)
 #endif
 
+#if IS_ENABLED(CONFIG_CGROUP_ENCRYPTION_IDS)
+SUBSYS(encryption_ids)
+#endif
+
 /*
  * The following subsystems are not supported on the default hierarchy.
  */
diff --git a/include/linux/encryption_ids_cgroup.h b/include/linux/encryption_ids_cgroup.h
new file mode 100644
index 000000000000..c450ae476de5
--- /dev/null
+++ b/include/linux/encryption_ids_cgroup.h
@@ -0,0 +1,71 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Encryption IDs cgroup controller.
+ *
+ * Copyright 2020 Google LLC
+ * Author: Vipin Sharma <vipinsh@google.com>
+ */
+#ifndef _ENCRYPTION_IDS_CGROUP_H_
+#define _ENCRYPTION_IDS_CGROUP_H_
+
+#include <linux/cgroup-defs.h>
+#include <linux/kvm_types.h>
+
+/**
+ * Types of encryption IDs supported by the host.
+ */
+enum encryption_id_type {
+#ifdef CONFIG_KVM_AMD_SEV
+	ENCRYPTION_ID_SEV,
+#endif
+	ENCRYPTION_ID_TYPES
+};
+
+#ifdef CONFIG_CGROUP_ENCRYPTION_IDS
+
+/**
+ * struct encryption_id_res: Per cgroup per encryption ID resource
+ * @max: Maximum count of encryption ID that can be used.
+ * @usage: Current usage of encryption ID in the cgroup.
+ */
+struct encryption_id_res {
+	unsigned int max;
+	unsigned int usage;
+};
+
+/**
+ * struct encryption_id_cgroup - Encryption IDs controller's cgroup structure.
+ * @css: cgroup subsys state object.
+ * @ids: Array of encryption IDs resource usage in the cgroup.
+ */
+struct encryption_id_cgroup {
+	struct cgroup_subsys_state css;
+	struct encryption_id_res res[ENCRYPTION_ID_TYPES];
+};
+
+int enc_id_cg_set_capacity(enum encryption_id_type type, unsigned int capacity);
+int enc_id_cg_try_charge(struct kvm *kvm, enum encryption_id_type type,
+			 unsigned int amount);
+void enc_id_cg_uncharge(struct kvm *kvm, enum encryption_id_type type,
+			unsigned int amount);
+#else
+static inline int enc_id_cg_set_capacity(enum encryption_id_type type,
+					 unsigned int capacity)
+{
+	return 0;
+}
+
+static inline int enc_id_cg_try_charge(struct kvm *kvm,
+				       enum encryption_id_type type,
+				       unsigned int amount)
+{
+	return 0;
+}
+
+static inline void enc_id_cg_uncharge(struct kvm *kvm,
+				      enum encryption_id_type type,
+				      unsigned int amount)
+{
+}
+#endif /* CONFIG_CGROUP_ENCRYPTION_IDS */
+#endif /* _ENCRYPTION_CGROUP_H_ */
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 7f2e2a09ebbd..4c20b8fb1374 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -27,6 +27,7 @@
 #include <linux/refcount.h>
 #include <linux/nospec.h>
 #include <asm/signal.h>
+#include <linux/encryption_ids_cgroup.h>
 
 #include <linux/kvm.h>
 #include <linux/kvm_para.h>
@@ -505,6 +506,9 @@ struct kvm {
 	struct srcu_struct irq_srcu;
 	pid_t userspace_pid;
 	unsigned int max_halt_poll_ns;
+#ifdef CONFIG_CGROUP_ENCRYPTION_IDS
+	struct encryption_id_cgroup *enc_id_cg;
+#endif
 };
 
 #define kvm_err(fmt, ...) \
diff --git a/init/Kconfig b/init/Kconfig
index 0872a5a2e759..8266acf92a54 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1106,6 +1106,20 @@ config CGROUP_BPF
 	  BPF_CGROUP_INET_INGRESS will be executed on the ingress path of
 	  inet sockets.
 
+config CGROUP_ENCRYPTION_IDS
+	bool "Encryption IDs controller"
+	depends on KVM_AMD_SEV
+	default n
+	help
+	  Provides a controller for CPU encryption IDs on a host.
+
+	  Some platforms have limited number of encryption IDs which can be
+	  used simultaneously, e.g., AMD's Secure Encrypted Virtualization
+	  (SEV). This controller tracks and limits the total number of IDs used
+	  by processes attached to a cgroup hierarchy. For more information,
+	  please check Encryption IDs section in
+	  /Documentation/admin-guide/cgroup-v2.rst.
+
 config CGROUP_DEBUG
 	bool "Debug controller"
 	default n
diff --git a/kernel/cgroup/Makefile b/kernel/cgroup/Makefile
index 5d7a76bfbbb7..6c19208dfb7f 100644
--- a/kernel/cgroup/Makefile
+++ b/kernel/cgroup/Makefile
@@ -5,4 +5,5 @@ obj-$(CONFIG_CGROUP_FREEZER) += legacy_freezer.o
 obj-$(CONFIG_CGROUP_PIDS) += pids.o
 obj-$(CONFIG_CGROUP_RDMA) += rdma.o
 obj-$(CONFIG_CPUSETS) += cpuset.o
+obj-$(CONFIG_CGROUP_ENCRYPTION_IDS) += encryption_ids.o
 obj-$(CONFIG_CGROUP_DEBUG) += debug.o
diff --git a/kernel/cgroup/encryption_ids.c b/kernel/cgroup/encryption_ids.c
new file mode 100644
index 000000000000..2a3ae6b45d7c
--- /dev/null
+++ b/kernel/cgroup/encryption_ids.c
@@ -0,0 +1,430 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Encryption IDs cgroup controller
+ *
+ * Copyright 2020 Google LLC
+ * Author: Vipin Sharma <vipinsh@google.com>
+ */
+
+#include <linux/limits.h>
+#include <linux/cgroup.h>
+#include <linux/errno.h>
+#include <linux/spinlock.h>
+#include <linux/lockdep.h>
+#include <linux/slab.h>
+#include <linux/kvm_host.h>
+#include <linux/encryption_ids_cgroup.h>
+
+#define MAX_STR "max"
+#define MAX_NUM UINT_MAX
+
+/* Root Encryption ID cgroup */
+static struct encryption_id_cgroup root_cg;
+
+/* Lock for tracking and updating encryption ID resources. */
+static DEFINE_SPINLOCK(enc_id_cg_lock);
+
+/* Encryption ID types capacity. */
+static unsigned int enc_id_capacity[ENCRYPTION_ID_TYPES];
+
+/**
+ * css_enc() - Get encryption ID cgroup from the css.
+ * @css: cgroup subsys state object.
+ *
+ * Context: Any context.
+ * Return:
+ * * %NULL - If @css is null.
+ * * struct encryption_id_cgroup* - Encryption ID cgroup pointer of the passed
+ *				    css.
+ */
+static struct encryption_id_cgroup *css_enc(struct cgroup_subsys_state *css)
+{
+	return css ? container_of(css, struct encryption_id_cgroup, css) : NULL;
+}
+
+/**
+ * parent_enc() - Get the parent of the passed encryption ID cgroup.
+ * @cgroup: cgroup whose parent needs to be fetched.
+ *
+ * Context: Any context.
+ * Return:
+ * * struct encryption_id_cgroup* - Parent of the @cgroup.
+ * * %NULL - If @cgroup is null or the passed cgroup does not have a parent.
+ */
+static struct encryption_id_cgroup *
+parent_enc(struct encryption_id_cgroup *cgroup)
+{
+	return cgroup ? css_enc(cgroup->css.parent) : NULL;
+}
+
+/**
+ * valid_type() - Check if @type is valid or not.
+ * @type: encryption ID type.
+ *
+ * Context: Any context.
+ * Return:
+ * * true - If valid type.
+ * * false - If not valid type.
+ */
+static inline bool valid_type(enum encryption_id_type type)
+{
+	return type >= 0 && type < ENCRYPTION_ID_TYPES;
+}
+
+/**
+ * enc_id_cg_uncharge_hierarchy() - Uncharge the enryption ID cgroup hierarchy.
+ * @start_cg: Starting cgroup.
+ * @stop_cg: cgroup at which uncharge stops.
+ * @type: type of encryption ID to uncharge.
+ * @amount: Charge amount.
+ *
+ * Uncharge the cgroup tree from the given start cgroup to the stop cgroup.
+ *
+ * Context: Any context. Expects enc_id_cg_lock to be held by the caller.
+ */
+static void enc_id_cg_uncharge_hierarchy(struct encryption_id_cgroup *start_cg,
+					 struct encryption_id_cgroup *stop_cg,
+					 enum encryption_id_type type,
+					 unsigned int amount)
+{
+	struct encryption_id_cgroup *i;
+
+	lockdep_assert_held(&enc_id_cg_lock);
+
+	for (i = start_cg; i != stop_cg; i = parent_enc(i)) {
+		WARN_ON_ONCE(i->res[type].usage < amount);
+		i->res[type].usage -= amount;
+	}
+	css_put(&start_cg->css);
+}
+
+/**
+ * enc_id_cg_set_capacity() - Set the capacity of the encryption ID.
+ * @type: Type of the encryption ID.
+ * @capacity: Supported capacity of the encryption ID on the host.
+ *
+ * If capacity is 0 then the charging a cgroup fails for the encryption ID.
+ *
+ * Context: Any context. Takes and releases enc_id_cg_lock.
+ * Return:
+ * * %0 - Successfully registered the capacity.
+ * * %-EINVAL - If @type is invalid.
+ * * %-EBUSY - If current usage is more than the capacity.
+ */
+int enc_id_cg_set_capacity(enum encryption_id_type type, unsigned int capacity)
+{
+	int ret = 0;
+	unsigned long flags;
+
+	if (!valid_type(type))
+		return -EINVAL;
+
+	spin_lock_irqsave(&enc_id_cg_lock, flags);
+
+	if (WARN_ON_ONCE(root_cg.res[type].usage > capacity))
+		ret = -EBUSY;
+	else
+		enc_id_capacity[type] = capacity;
+
+	spin_unlock_irqrestore(&enc_id_cg_lock, flags);
+
+	return ret;
+}
+EXPORT_SYMBOL(enc_id_cg_set_capacity);
+
+/**
+ * enc_id_cg_try_charge() - Try charging encryption ID cgroup.
+ * @kvm: kvm to store charged cgroup.
+ * @type: Encryption ID type to charge.
+ * @amount: Amount to charge.
+ *
+ * Charge @amount to the cgroup to which the current task belongs to. Charged
+ * cgroup will be pointed by @cg. Caller must use the same cgroup during
+ * uncharge call.
+ *
+ * Context: Any context. Takes and releases enc_id_cg_lock.
+ * Return:
+ * * %0 - If successfully charged.
+ * * -EINVAL - If @type is invalid or encryption ID has 0 capacity.
+ * * -EBUSY - If max limit will be crossed or total usage will be more than the
+ *	      capacity.
+ */
+int enc_id_cg_try_charge(struct kvm *kvm, enum encryption_id_type type,
+			 unsigned int amount)
+{
+	struct encryption_id_cgroup *task_cg, *i;
+	struct encryption_id_res *id_res;
+	int ret;
+	unsigned int new_usage;
+	unsigned long flags;
+
+	if (!valid_type(type) || !kvm)
+		return -EINVAL;
+
+	if (!amount)
+		return 0;
+
+	spin_lock_irqsave(&enc_id_cg_lock, flags);
+
+	if (!enc_id_capacity[type]) {
+		ret = -EINVAL;
+		goto err_capacity;
+	}
+
+	task_cg = css_enc(task_get_css(current, encryption_ids_cgrp_id));
+
+	for (i = task_cg; i; i = parent_enc(i)) {
+		id_res = &i->res[type];
+
+		new_usage = id_res->usage + amount;
+		WARN_ON_ONCE(new_usage < id_res->usage);
+
+		if (new_usage > id_res->max ||
+		    new_usage > enc_id_capacity[type]) {
+			ret = -EBUSY;
+			goto err_charge;
+		}
+
+		id_res->usage = new_usage;
+	}
+
+	kvm->enc_id_cg = task_cg;
+	spin_unlock_irqrestore(&enc_id_cg_lock, flags);
+	return 0;
+
+err_charge:
+	enc_id_cg_uncharge_hierarchy(task_cg, i, type, amount);
+err_capacity:
+	spin_unlock_irqrestore(&enc_id_cg_lock, flags);
+	return ret;
+}
+EXPORT_SYMBOL(enc_id_cg_try_charge);
+
+/**
+ * enc_id_cg_uncharge() - Uncharge the encryption ID cgroup.
+ * @kvm: kvm containing the corresponding encryption ID cgroup.
+ * @type: Encryption ID which was charged.
+ * @amount: Charged amount.
+ *
+ * Context: Any context. Takes and releases enc_id_cg_lock.
+ */
+void enc_id_cg_uncharge(struct kvm *kvm, enum encryption_id_type type,
+			unsigned int amount)
+{
+	unsigned long flags;
+
+	if (!amount)
+		return;
+	if (!valid_type(type))
+		return;
+	if (!kvm || WARN_ON_ONCE(!(kvm->enc_id_cg)))
+		return;
+
+	spin_lock_irqsave(&enc_id_cg_lock, flags);
+	enc_id_cg_uncharge_hierarchy(kvm->enc_id_cg, NULL, type, amount);
+	spin_unlock_irqrestore(&enc_id_cg_lock, flags);
+
+	kvm->enc_id_cg = NULL;
+}
+EXPORT_SYMBOL(enc_id_cg_uncharge);
+
+/**
+ * enc_id_cg_max_show() - Show encryption ID cgroup max limit.
+ * @sf: Interface file
+ * @v: Arguments passed
+ *
+ * Uses cft->private value to determine for which enryption ID type results be
+ * shown.
+ *
+ * Context: Any context.
+ * Return: 0 to denote successful print.
+ */
+static int enc_id_cg_max_show(struct seq_file *sf, void *v)
+{
+	struct encryption_id_cgroup *cg = css_enc(seq_css(sf));
+	enum encryption_id_type type = seq_cft(sf)->private;
+
+	if (cg->res[type].max == MAX_NUM)
+		seq_printf(sf, "%s\n", MAX_STR);
+	else
+		seq_printf(sf, "%u\n", cg->res[type].max);
+
+	return 0;
+}
+
+/**
+ * enc_id_cg_max_write() - Update the maximum limit of the cgroup.
+ * @of: Handler for the file.
+ * @buf: Data from the user. It should be either "max", 0, or a positive
+ *	 integer.
+ * @nbytes: Number of bytes of the data.
+ * @off: Offset in the file.
+ *
+ * Uses cft->private value to determine for which enryption ID type results be
+ * shown.
+ *
+ * Context: Any context. Takes and releases enc_id_cg_lock.
+ * Return:
+ * * >= 0 - Number of bytes processed in the input.
+ * * -EINVAL - If buf is not valid.
+ * * -ERANGE - If number is bigger than unsigned int capacity.
+ * * -EBUSY - If usage can become more than max limit.
+ */
+static ssize_t enc_id_cg_max_write(struct kernfs_open_file *of, char *buf,
+				   size_t nbytes, loff_t off)
+{
+	struct encryption_id_cgroup *cg;
+	unsigned int max;
+	int ret = 0;
+	unsigned long flags;
+	enum encryption_id_type type;
+
+	buf = strstrip(buf);
+	if (!strcmp(MAX_STR, buf)) {
+		max = UINT_MAX;
+	} else {
+		ret = kstrtouint(buf, 0, &max);
+		if (ret)
+			return ret;
+	}
+
+	cg = css_enc(of_css(of));
+	type = of_cft(of)->private;
+
+	spin_lock_irqsave(&enc_id_cg_lock, flags);
+
+	if (cg->res[type].usage <= max)
+		cg->res[type].max = max;
+	else
+		ret = -EBUSY;
+
+	spin_unlock_irqrestore(&enc_id_cg_lock, flags);
+
+	return ret ? ret : nbytes;
+}
+
+/**
+ * enc_id_cg_current_read() - Show current usage of the encryption ID.
+ * @css: css pointer of the cgroup.
+ * @cft: cft pointer of the cgroup.
+ *
+ * Uses cft->private value to determine for which enryption ID type results be
+ * shown.
+ *
+ * Context: Any context.
+ * Return: 0 to denote successful print.
+ */
+static u64 enc_id_cg_current_read(struct cgroup_subsys_state *css,
+				  struct cftype *cft)
+{
+	struct encryption_id_cgroup *cg = css_enc(css);
+	enum encryption_id_type type = cft->private;
+
+	return cg->res[type].usage;
+}
+
+/**
+ * enc_id_cg_stat_show() - Show the current stat of the cgroup.
+ * @sf: Interface file
+ * @v: Arguments passed
+ *
+ * Shows the total capacity of the encryption ID and its current usage.
+ * Only shows in root cgroup directory.
+ *
+ * Uses cft->private value to determine for which enryption ID type results be
+ * shown.
+ *
+ * Context: Any context. Takes and releases enc_id_cg_lock.
+ * Return: 0 to denote successful print.
+ */
+static int enc_id_cg_stat_show(struct seq_file *sf, void *v)
+{
+	unsigned long flags;
+	enum encryption_id_type type = seq_cft(sf)->private;
+
+	spin_lock_irqsave(&enc_id_cg_lock, flags);
+
+	seq_printf(sf, "total %u\n", enc_id_capacity[type]);
+	seq_printf(sf, "used %u\n", root_cg.res[type].usage);
+
+	spin_unlock_irqrestore(&enc_id_cg_lock, flags);
+	return 0;
+}
+
+/* Each encryption ID type has these cgroup files. */
+#define ENC_ID_CGROUP_FILES(id_name, id_type)		\
+	[(id_type) * 3] = {				\
+		.name = id_name ".max",			\
+		.write = enc_id_cg_max_write,		\
+		.seq_show = enc_id_cg_max_show,		\
+		.flags = CFTYPE_NOT_ON_ROOT,		\
+		.private = id_type,			\
+	},						\
+	[((id_type) * 3) + 1] = {			\
+		.name = id_name ".current",		\
+		.read_u64 = enc_id_cg_current_read,	\
+		.flags = CFTYPE_NOT_ON_ROOT,		\
+		.private = id_type,			\
+	},						\
+	[((id_type) * 3) + 2] = {			\
+		.name = id_name ".stat",		\
+		.seq_show = enc_id_cg_stat_show,	\
+		.flags = CFTYPE_ONLY_ON_ROOT,		\
+		.private = id_type,			\
+	}
+
+/* Encryption ID cgroup interface files */
+static struct cftype enc_id_cg_files[] = {
+#ifdef CONFIG_KVM_AMD_SEV
+	ENC_ID_CGROUP_FILES("sev", ENCRYPTION_ID_SEV),
+#endif
+	{}
+};
+
+/**
+ * enc_id_cg_alloc() - Allocate encryption ID cgroup.
+ * @parent_css: Parent cgroup.
+ *
+ * Context: Process context.
+ * Return:
+ * * struct cgroup_subsys_state* - css of the allocated cgroup.
+ * * ERR_PTR(-ENOMEM) - No memory available to allocate.
+ */
+static struct cgroup_subsys_state *
+enc_id_cg_alloc(struct cgroup_subsys_state *parent_css)
+{
+	enum encryption_id_type i;
+	struct encryption_id_cgroup *cg;
+
+	if (!parent_css) {
+		cg = &root_cg;
+	} else {
+		cg = kzalloc(sizeof(*cg), GFP_KERNEL);
+		if (!cg)
+			return ERR_PTR(-ENOMEM);
+	}
+
+	for (i = 0; i < ENCRYPTION_ID_TYPES; i++)
+		cg->res[i].max = MAX_NUM;
+
+	return &cg->css;
+}
+
+/**
+ * enc_id_cg_free() - Free the encryption ID cgroup.
+ * @css: cgroup subsys object.
+ *
+ * Context: Any context.
+ */
+static void enc_id_cg_free(struct cgroup_subsys_state *css)
+{
+	kfree(css_enc(css));
+}
+
+/* Cgroup controller callbacks */
+struct cgroup_subsys encryption_ids_cgrp_subsys = {
+	.css_alloc = enc_id_cg_alloc,
+	.css_free = enc_id_cg_free,
+	.legacy_cftypes = enc_id_cg_files,
+	.dfl_cftypes = enc_id_cg_files,
+};
-- 
2.29.2.576.ga3fc446d84-goog


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Patch v3 2/2] cgroup: svm: Encryption IDs cgroup documentation.
  2020-12-09 20:54 [Patch v3 0/2] cgroup: KVM: New Encryption IDs cgroup controller Vipin Sharma
  2020-12-09 20:54 ` [Patch v3 1/2] cgroup: svm: Add Encryption ID controller Vipin Sharma
@ 2020-12-09 20:54 ` Vipin Sharma
  2020-12-09 20:58   ` Tejun Heo
  2 siblings, 0 replies; 13+ messages in thread
From: Vipin Sharma @ 2020-12-09 20:54 UTC (permalink / raw)
  To: thomas.lendacky, brijesh.singh, jon.grimm, eric.vantassell,
	pbonzini, seanjc, tj, lizefan, hannes, frankja, borntraeger,
	corbet
  Cc: joro, vkuznets, wanpengli, jmattson, tglx, mingo, bp, hpa,
	gingell, rientjes, dionnaglaze, kvm, x86, cgroups, linux-doc,
	linux-kernel, Vipin Sharma

Documentation for both cgroup versions, v1 and v2, of Encryption IDs
controller. This new controller is used to track and limit usage of
hardware memory encryption capabilities on the CPUs.

Signed-off-by: Vipin Sharma <vipinsh@google.com>
Reviewed-by: David Rientjes <rientjes@google.com>
Reviewed-by: Dionna Glaze <dionnaglaze@google.com>
---
 .../admin-guide/cgroup-v1/encryption_ids.rst  | 108 ++++++++++++++++++
 Documentation/admin-guide/cgroup-v2.rst       |  78 ++++++++++++-
 2 files changed, 184 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/admin-guide/cgroup-v1/encryption_ids.rst

diff --git a/Documentation/admin-guide/cgroup-v1/encryption_ids.rst b/Documentation/admin-guide/cgroup-v1/encryption_ids.rst
new file mode 100644
index 000000000000..891143b4e229
--- /dev/null
+++ b/Documentation/admin-guide/cgroup-v1/encryption_ids.rst
@@ -0,0 +1,108 @@
+=========================
+Encryption IDs Controller
+=========================
+
+Overview
+========
+There are multiple hardware memory encryption capabilities provided by the
+hardware vendors, like Secure Encrypted Virtualization (SEV) and SEV Encrypted
+State (SEV-ES) from AMD.
+
+These features are being used in encrypting virtual machines (VMs) and user
+space programs. However, only a small number of keys/IDs can be used
+simultaneously.
+
+This limited availability of these IDs requires system admin to optimize
+allocation, control, and track the usage of the resources in the cloud
+infrastructure. This resource also needs to be protected from getting exhausted
+by some malicious program and causing starvation for other programs.
+
+Encryption IDs controller provides capability to register the resource for
+controlling and tracking through the cgroups.
+
+How to Enable Controller
+========================
+
+- Enable Encryption controller::
+
+        CONFIG_CGROUP_ENCRYPTION_IDS=y
+
+- Above options will build Encryption controller support in the kernel.
+  To mount the Encryption controller::
+
+        mount -t cgroup -o encryption none /sys/fs/cgroup/encryption
+
+
+Interface Files
+===============
+Each encryption ID type have their own interface files,
+encryption_id.[ID TYPE].{max, current, stat}, where "ID TYPE" can be sev and
+sev-es.
+
+  encryption_ids.[ID TYPE].stat
+        A read-only flat-keyed single value file. This file exists only in the
+        root cgroup.
+
+        It shows the total number of encryption IDs available and currently in
+        use on the platform::
+          # cat encryption.sev.stat
+          total 509
+          used 0
+
+  encryption_ids.[ID TYPE].max
+        A read-write file which exists on the non-root cgroups. File is used to
+        set maximum count of "[ID TYPE]" which can be used in the cgroup.
+
+        Limit can be set to max by::
+          # echo max > encryption.sev.max
+
+        Limit can be set by::
+          # echo 100 > encryption.sev.max
+
+        This file shows the max limit of the encryption ID in the cgroup::
+          # cat encryption.sev.max
+          max
+
+        OR::
+          # cat encryption.sev.max
+          100
+
+        Limits can be set more than the "total" capacity value in the
+        encryption_ids.[ID TYPE].stat file, however, the controller ensures
+        that the usage never exceeds the "total" and the max limit.
+
+  encryption_ids.[ID TYPE].current
+        A read-only single value file which exists on non-root cgroups.
+
+        Shows the total number of encrypted IDs being used in the cgroup.
+
+Hierarchy
+=========
+
+Encryption IDs controller supports hierarchical accounting. It supports
+following features:
+
+1. Current usage in the cgroup shows IDs used in the cgroup and its descendent cgroups.
+2. Current usage can never exceed the corresponding max limit set in the cgroup
+   and its ancestor's chain up to the root.
+
+Suppose the following example hierarchy::
+
+                        root
+                        /  \
+                       A    B
+                       |
+                       C
+
+1. A will show the count of IDs used in A and C.
+2. C's current IDs usage may not exceed any of the max limits set in C, A, or
+   root.
+
+Migration and ownership
+=======================
+
+An encryption ID is charged to the cgroup in which it is used first, and
+stays charged to that cgroup until that ID is freed. Migrating a process
+to a different cgroup do not move the charge to the destination cgroup
+where the process has moved.
+
diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 608d7c279396..7938bb7c6e1c 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -63,8 +63,11 @@ v1 is available under :ref:`Documentation/admin-guide/cgroup-v1/index.rst <cgrou
        5-7-1. RDMA Interface Files
      5-8. HugeTLB
        5.8-1. HugeTLB Interface Files
-     5-8. Misc
-       5-8-1. perf_event
+     5-9. Encryption IDs
+       5.9-1 Encryption IDs Interface Files
+       5.9-2 Migration and Ownership
+     5-10. Misc
+       5-10-1. perf_event
      5-N. Non-normative information
        5-N-1. CPU controller root cgroup process behaviour
        5-N-2. IO controller root cgroup process behaviour
@@ -2149,6 +2152,77 @@ HugeTLB Interface Files
 	are local to the cgroup i.e. not hierarchical. The file modified event
 	generated on this file reflects only the local events.
 
+Encryption IDs
+--------------
+
+There are multiple hardware memory encryption capabilities provided by the
+hardware vendors, like Secure Encrypted Virtualization (SEV) and SEV Encrypted
+State (SEV-ES) from AMD.
+
+These features are being used in encrypting virtual machines (VMs) and user
+space programs. However, only a small number of keys/IDs can be used
+simultaneously.
+
+This limited availability of these IDs requires system admin to optimize
+allocation, control, and track the usage of the resources in the cloud
+infrastructure. This resource also needs to be protected from getting exhausted
+by some malicious program and causing starvation for other programs.
+
+Encryption IDs controller provides capability to register the resource for
+controlling and tracking through the cgroups.
+
+Encryption IDs Interface Files
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Each encryption ID type have their own interface files,
+encryption_id.[ID TYPE].{max, current, stat}, where "ID TYPE" can be sev and
+sev-es.
+
+  encryption_ids.[ID TYPE].stat
+        A read-only flat-keyed single value file. This file exists only in the
+        root cgroup.
+
+        It shows the total number of encryption IDs available and currently in
+        use on the platform::
+          # cat encryption.sev.stat
+          total 509
+          used 0
+
+  encryption_ids.[ID TYPE].max
+        A read-write file which exists on the non-root cgroups. File is used to
+        set maximum count of "[ID TYPE]" which can be used in the cgroup.
+
+        Limit can be set to max by::
+          # echo max > encryption.sev.max
+
+        Limit can be set by::
+          # echo 100 > encryption.sev.max
+
+        This file shows the max limit of the encryption ID in the cgroup::
+          # cat encryption.sev.max
+          max
+
+        OR::
+          # cat encryption.sev.max
+          100
+
+        Limits can be set more than the "total" capacity value in the
+        encryption_ids.[ID TYPE].stat file, however, the controller ensures
+        that the usage never exceeds the "total" and the max limit.
+
+  encryption_ids.[ID TYPE].current
+        A read-only single value file which exists on non-root cgroups.
+
+        Shows the total number of encrypted IDs being used in the cgroup.
+
+Migration and Ownership
+~~~~~~~~~~~~~~~~~~~~~~~
+
+An encryption ID is charged to the cgroup in which it is used first, and
+stays charged to that cgroup until that ID is freed. Migrating a process
+to a different cgroup do not move the charge to the destination cgroup
+where the process has moved.
+
 Misc
 ----
 
-- 
2.29.2.576.ga3fc446d84-goog


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [Patch v3 0/2] cgroup: KVM: New Encryption IDs cgroup controller
@ 2020-12-09 20:58   ` Tejun Heo
  0 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2020-12-09 20:58 UTC (permalink / raw)
  To: Vipin Sharma
  Cc: thomas.lendacky, brijesh.singh, jon.grimm, eric.vantassell,
	pbonzini, seanjc, lizefan, hannes, frankja, borntraeger, corbet,
	joro, vkuznets, wanpengli, jmattson, tglx, mingo, bp, hpa,
	gingell, rientjes, dionnaglaze, kvm, x86, cgroups, linux-doc,
	linux-kernel

Hello,

Rough take after skimming:

* I don't have an overall objection. In terms of behavior, the only thing
  which stood out was input rejection depending on the current usage. The
  preferred way of handling that is rejecting future allocations rather than
  failing configuration as that makes it impossible e.g. to lower limit and
  drain existing usages from outside the container.

* However, the boilerplate to usefulness ratio doesn't look too good and I
  wonder whether what we should do is adding a generic "misc" controller
  which can host this sort of static hierarchical counting. I'll think more
  on it.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Patch v3 0/2] cgroup: KVM: New Encryption IDs cgroup controller
@ 2020-12-09 20:58   ` Tejun Heo
  0 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2020-12-09 20:58 UTC (permalink / raw)
  To: Vipin Sharma
  Cc: thomas.lendacky-5C7GfCeVMHo, brijesh.singh-5C7GfCeVMHo,
	jon.grimm-5C7GfCeVMHo, eric.vantassell-5C7GfCeVMHo,
	pbonzini-H+wXaHxf7aLQT0dZR+AlfA, seanjc-hpIqsD4AKlfQT0dZR+AlfA,
	lizefan-hv44wF8Li93QT0dZR+AlfA, hannes-druUgvl0LCNAfugRpC6u6w,
	frankja-tEXmvtCZX7AybS5Ee8rs3A,
	borntraeger-tA70FqPdS9bQT0dZR+AlfA, corbet-T1hC0tSOHrs,
	joro-zLv9SwRftAIdnm+yROfE0A, vkuznets-H+wXaHxf7aLQT0dZR+AlfA,
	wanpengli-1Nz4purKYjRBDgjK7y7TUQ,
	jmattson-hpIqsD4AKlfQT0dZR+AlfA, tglx-hfZtesqFncYOwBW4kG4KsQ,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, bp-Gina5bIWoIWzQB+pC5nmwQ,
	hpa-YMNOUZJC4hwAvxtiuMwx3w, gingell-hpIqsD4AKlfQT0dZR+AlfA,
	rientjes-hpIqsD4AKlfQT0dZR+AlfA,
	dionnaglaze-hpIqsD4AKlfQT0dZR+AlfA, kvm-u79uwXL29TY76Z2rM5mHXA,
	x86-DgEjT+Ai2ygdnm+yROfE0A, cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-doc-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

Hello,

Rough take after skimming:

* I don't have an overall objection. In terms of behavior, the only thing
  which stood out was input rejection depending on the current usage. The
  preferred way of handling that is rejecting future allocations rather than
  failing configuration as that makes it impossible e.g. to lower limit and
  drain existing usages from outside the container.

* However, the boilerplate to usefulness ratio doesn't look too good and I
  wonder whether what we should do is adding a generic "misc" controller
  which can host this sort of static hierarchical counting. I'll think more
  on it.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Patch v3 0/2] cgroup: KVM: New Encryption IDs cgroup controller
  2020-12-09 20:58   ` Tejun Heo
@ 2020-12-10 14:54     ` Christian Borntraeger
  -1 siblings, 0 replies; 13+ messages in thread
From: Christian Borntraeger @ 2020-12-10 14:54 UTC (permalink / raw)
  To: Tejun Heo, Vipin Sharma
  Cc: thomas.lendacky, brijesh.singh, jon.grimm, eric.vantassell,
	pbonzini, seanjc, lizefan, hannes, frankja, corbet, joro,
	vkuznets, wanpengli, jmattson, tglx, mingo, bp, hpa, gingell,
	rientjes, dionnaglaze, kvm, x86, cgroups, linux-doc,
	linux-kernel

On 09.12.20 21:58, Tejun Heo wrote:
> Hello,
> 
> Rough take after skimming:
> 
> * I don't have an overall objection. In terms of behavior, the only thing
>   which stood out was input rejection depending on the current usage. The
>   preferred way of handling that is rejecting future allocations rather than
>   failing configuration as that makes it impossible e.g. to lower limit and
>   drain existing usages from outside the container.
> 
> * However, the boilerplate to usefulness ratio doesn't look too good and I
>   wonder whether what we should do is adding a generic "misc" controller
>   which can host this sort of static hierarchical counting. I'll think more
>   on it.

We first dicussed to have
encryption_ids.stat
encryption_ids.max
encryption_ids.current

and we added the sev in later, so that we can also have tdx, seid, sgx or whatever.
Maybe also 2 or more things at the same time.

Right now this code has

encryption_ids.sev.stat
encryption_ids.sev.max
encryption_ids.sev.current

And it would be trivial to extend it to have
encryption_ids.seid.stat
encryption_ids.seid.max
encryption_ids.seid.current
on s390 instead (for our secure guests).

So in the end this is almost already a misc controller, the only thing that we
need to change is the capability to also define things other than encryption.*.*
And of course we would need to avoid adding lots of random garbage to such a thing.

But if you feel ok with the burden to keep things kind of organized a misc
controller would certainly work for the encryption ID usecase as well. 
So I would be fine with the thing as is or a misc controlĺer.

Christian

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Patch v3 0/2] cgroup: KVM: New Encryption IDs cgroup controller
@ 2020-12-10 14:54     ` Christian Borntraeger
  0 siblings, 0 replies; 13+ messages in thread
From: Christian Borntraeger @ 2020-12-10 14:54 UTC (permalink / raw)
  To: Tejun Heo, Vipin Sharma
  Cc: thomas.lendacky, brijesh.singh, jon.grimm, eric.vantassell,
	pbonzini, seanjc, lizefan, hannes, frankja, corbet, joro,
	vkuznets, wanpengli, jmattson, tglx, mingo, bp, hpa, gingell,
	rientjes, dionnaglaze, kvm, x86, cgroups, linux-doc,
	linux-kernel

On 09.12.20 21:58, Tejun Heo wrote:
> Hello,
> 
> Rough take after skimming:
> 
> * I don't have an overall objection. In terms of behavior, the only thing
>   which stood out was input rejection depending on the current usage. The
>   preferred way of handling that is rejecting future allocations rather than
>   failing configuration as that makes it impossible e.g. to lower limit and
>   drain existing usages from outside the container.
> 
> * However, the boilerplate to usefulness ratio doesn't look too good and I
>   wonder whether what we should do is adding a generic "misc" controller
>   which can host this sort of static hierarchical counting. I'll think more
>   on it.

We first dicussed to have
encryption_ids.stat
encryption_ids.max
encryption_ids.current

and we added the sev in later, so that we can also have tdx, seid, sgx or whatever.
Maybe also 2 or more things at the same time.

Right now this code has

encryption_ids.sev.stat
encryption_ids.sev.max
encryption_ids.sev.current

And it would be trivial to extend it to have
encryption_ids.seid.stat
encryption_ids.seid.max
encryption_ids.seid.current
on s390 instead (for our secure guests).

So in the end this is almost already a misc controller, the only thing that we
need to change is the capability to also define things other than encryption.*.*
And of course we would need to avoid adding lots of random garbage to such a thing.

But if you feel ok with the burden to keep things kind of organized a misc
controller would certainly work for the encryption ID usecase as well. 
So I would be fine with the thing as is or a misc controlĺer.

Christian

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Patch v3 0/2] cgroup: KVM: New Encryption IDs cgroup controller
  2020-12-10 14:54     ` Christian Borntraeger
  (?)
@ 2020-12-10 23:44     ` David Rientjes
  2020-12-16 15:27       ` Tejun Heo
  -1 siblings, 1 reply; 13+ messages in thread
From: David Rientjes @ 2020-12-10 23:44 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: Tejun Heo, Vipin Sharma, thomas.lendacky, brijesh.singh,
	jon.grimm, eric.vantassell, pbonzini, seanjc, lizefan, hannes,
	frankja, corbet, joro, vkuznets, wanpengli, jmattson, tglx,
	mingo, bp, hpa, gingell, dionnaglaze, kvm, x86, cgroups,
	linux-doc, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2149 bytes --]

On Thu, 10 Dec 2020, Christian Borntraeger wrote:

> > * However, the boilerplate to usefulness ratio doesn't look too good and I
> >   wonder whether what we should do is adding a generic "misc" controller
> >   which can host this sort of static hierarchical counting. I'll think more
> >   on it.
> 
> We first dicussed to have
> encryption_ids.stat
> encryption_ids.max
> encryption_ids.current
> 
> and we added the sev in later, so that we can also have tdx, seid, sgx or whatever.
> Maybe also 2 or more things at the same time.
> 
> Right now this code has
> 
> encryption_ids.sev.stat
> encryption_ids.sev.max
> encryption_ids.sev.current
> 
> And it would be trivial to extend it to have
> encryption_ids.seid.stat
> encryption_ids.seid.max
> encryption_ids.seid.current
> on s390 instead (for our secure guests).
> 
> So in the end this is almost already a misc controller, the only thing that we
> need to change is the capability to also define things other than encryption.*.*
> And of course we would need to avoid adding lots of random garbage to such a thing.
> 
> But if you feel ok with the burden to keep things kind of organized a misc
> controller would certainly work for the encryption ID usecase as well. 
> So I would be fine with the thing as is or a misc controlĺer.
> 

Yeah, I think generalization of this would come in the form of either (1) 
the dumping ground of an actual "misc" controller, that you elude to, or 
(2) a kernel abstraction so you can spin up your own generic controller 
that has the {current, max, stat} support.  In the case of the latter, 
encryption IDs becomes a user of that abstraction.

Concern with a single misc controller would be that any subsystem that 
wants to use it has to exactly fit this support: current, max, stat, 
nothing more.  The moment a controller needs some additional support, and 
its controller is already implemented in previous kernel versionv as a 
part of "misc," we face a problem.

On the other hand, a kernel abstraction that provides just the basic 
{current, max, stat} support might be interesting if it can be extended by 
the subsystem instance using it.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Patch v3 0/2] cgroup: KVM: New Encryption IDs cgroup controller
  2020-12-10 23:44     ` David Rientjes
@ 2020-12-16 15:27       ` Tejun Heo
  2020-12-16 20:02           ` Vipin Sharma
  0 siblings, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2020-12-16 15:27 UTC (permalink / raw)
  To: David Rientjes
  Cc: Christian Borntraeger, Vipin Sharma, thomas.lendacky,
	brijesh.singh, jon.grimm, eric.vantassell, pbonzini, seanjc,
	lizefan, hannes, frankja, corbet, joro, vkuznets, wanpengli,
	jmattson, tglx, mingo, bp, hpa, gingell, dionnaglaze, kvm, x86,
	cgroups, linux-doc, linux-kernel

Hello,

On Thu, Dec 10, 2020 at 03:44:35PM -0800, David Rientjes wrote:
> Concern with a single misc controller would be that any subsystem that 
> wants to use it has to exactly fit this support: current, max, stat, 
> nothing more.  The moment a controller needs some additional support, and 
> its controller is already implemented in previous kernel versionv as a 
> part of "misc," we face a problem.

Yeah, that's a valid concern, but given the history, there doesn't seem to
be much need beyond that for these use cases and the limited need seems
inherent to the way the resources are defined and consumed. So yeah, it can
either way.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Patch v3 0/2] cgroup: KVM: New Encryption IDs cgroup controller
@ 2020-12-16 20:02           ` Vipin Sharma
  0 siblings, 0 replies; 13+ messages in thread
From: Vipin Sharma @ 2020-12-16 20:02 UTC (permalink / raw)
  To: Tejun Heo
  Cc: David Rientjes, Christian Borntraeger, Tom Lendacky, Brijesh,
	Jon, Eric, pbonzini, Sean Christopherson, lizefan, hannes,
	Janosch Frank, corbet, joro, vkuznets, wanpengli, Jim Mattson,
	tglx, mingo, bp, hpa, Matt Gingell, Dionna Glaze, kvm, x86,
	cgroups, linux-doc, linux-kernel

On Wed, Dec 9, 2020 at 12:59 PM Tejun Heo <tj@kernel.org> wrote:
> * I don't have an overall objection. In terms of behavior, the only thing
>   which stood out was input rejection depending on the current usage. The
>   preferred way of handling that is rejecting future allocations rather than
>   failing configuration as that makes it impossible e.g. to lower limit and
>   drain existing usages from outside the container.

Thanks. In next version of the patch I will remove rejection of max value
based on current usage.

On Wed, Dec 16, 2020 at 7:27 AM Tejun Heo <tj@kernel.org> wrote:
> On Thu, Dec 10, 2020 at 03:44:35PM -0800, David Rientjes wrote:
> > Concern with a single misc controller would be that any subsystem that
> > wants to use it has to exactly fit this support: current, max, stat,
> > nothing more.  The moment a controller needs some additional support, and
> > its controller is already implemented in previous kernel versionv as a
> > part of "misc," we face a problem.
>
> Yeah, that's a valid concern, but given the history, there doesn't seem to
> be much need beyond that for these use cases and the limited need seems
> inherent to the way the resources are defined and consumed. So yeah, it can
> either way.

I think a misc controller should be able support other "Resource Distribution
Models" mentioned in the cgroup v2 documentation besides limits. There might be
use cases in future which want to use weight, protection or allocation models.
If that happens it will be more difficult to support these different resources.
This will also mean the same hierarchy might get charged differently by the
same controller.

I like the idea of having a separate controller to keep the code simple and
easier for maintenance.

If you decide to have a separate misc controller please let me know what will
be the overall expectations and I can change my patch to reflect that,
otherwise I can send out a new patch with just removal of max input rejection.

My understanding with a "misc" controller is it will be something like
- cgroup v2
  cgroup/misc.encryption_ids.{sev, tdx, seid}.{stat, max, current}

- cgroup v1
  cgroup/misc/misc.encryption_ids.{sev, tdx, seid}.{stat, max, current}

Thanks
Vipin

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Patch v3 0/2] cgroup: KVM: New Encryption IDs cgroup controller
@ 2020-12-16 20:02           ` Vipin Sharma
  0 siblings, 0 replies; 13+ messages in thread
From: Vipin Sharma @ 2020-12-16 20:02 UTC (permalink / raw)
  To: Tejun Heo
  Cc: David Rientjes, Christian Borntraeger, Tom Lendacky, Brijesh,
	Jon, Eric, pbonzini-H+wXaHxf7aLQT0dZR+AlfA, Sean Christopherson,
	lizefan-hv44wF8Li93QT0dZR+AlfA, hannes-druUgvl0LCNAfugRpC6u6w,
	Janosch Frank, corbet-T1hC0tSOHrs, joro-zLv9SwRftAIdnm+yROfE0A,
	vkuznets-H+wXaHxf7aLQT0dZR+AlfA,
	wanpengli-1Nz4purKYjRBDgjK7y7TUQ, Jim Mattson,
	tglx-hfZtesqFncYOwBW4kG4KsQ, mingo-H+wXaHxf7aLQT0dZR+AlfA,
	bp-Gina5bIWoIWzQB+pC5nmwQ, hpa-YMNOUZJC4hwAvxtiuMwx3w,
	Matt Gingell, Dionna Glaze, kvm-u79uwXL29TY76Z2rM5mHXA,
	x86-DgEjT+Ai2ygdnm+yROfE0A, cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-doc-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Wed, Dec 9, 2020 at 12:59 PM Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> * I don't have an overall objection. In terms of behavior, the only thing
>   which stood out was input rejection depending on the current usage. The
>   preferred way of handling that is rejecting future allocations rather than
>   failing configuration as that makes it impossible e.g. to lower limit and
>   drain existing usages from outside the container.

Thanks. In next version of the patch I will remove rejection of max value
based on current usage.

On Wed, Dec 16, 2020 at 7:27 AM Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> On Thu, Dec 10, 2020 at 03:44:35PM -0800, David Rientjes wrote:
> > Concern with a single misc controller would be that any subsystem that
> > wants to use it has to exactly fit this support: current, max, stat,
> > nothing more.  The moment a controller needs some additional support, and
> > its controller is already implemented in previous kernel versionv as a
> > part of "misc," we face a problem.
>
> Yeah, that's a valid concern, but given the history, there doesn't seem to
> be much need beyond that for these use cases and the limited need seems
> inherent to the way the resources are defined and consumed. So yeah, it can
> either way.

I think a misc controller should be able support other "Resource Distribution
Models" mentioned in the cgroup v2 documentation besides limits. There might be
use cases in future which want to use weight, protection or allocation models.
If that happens it will be more difficult to support these different resources.
This will also mean the same hierarchy might get charged differently by the
same controller.

I like the idea of having a separate controller to keep the code simple and
easier for maintenance.

If you decide to have a separate misc controller please let me know what will
be the overall expectations and I can change my patch to reflect that,
otherwise I can send out a new patch with just removal of max input rejection.

My understanding with a "misc" controller is it will be something like
- cgroup v2
  cgroup/misc.encryption_ids.{sev, tdx, seid}.{stat, max, current}

- cgroup v1
  cgroup/misc/misc.encryption_ids.{sev, tdx, seid}.{stat, max, current}

Thanks
Vipin

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Patch v3 0/2] cgroup: KVM: New Encryption IDs cgroup controller
  2020-12-16 20:02           ` Vipin Sharma
  (?)
@ 2021-01-05 15:36           ` Tejun Heo
  2021-01-06 18:45             ` Vipin Sharma
  -1 siblings, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2021-01-05 15:36 UTC (permalink / raw)
  To: Vipin Sharma
  Cc: David Rientjes, Christian Borntraeger, Tom Lendacky, Brijesh,
	Jon, Eric, pbonzini, Sean Christopherson, lizefan, hannes,
	Janosch Frank, corbet, joro, vkuznets, wanpengli, Jim Mattson,
	tglx, mingo, bp, hpa, Matt Gingell, Dionna Glaze, kvm, x86,
	cgroups, linux-doc, linux-kernel

Happy new year!

On Wed, Dec 16, 2020 at 12:02:37PM -0800, Vipin Sharma wrote:
> I like the idea of having a separate controller to keep the code simple and
> easier for maintenance.

Yeah, the more I think about it, keeping it separate seems like the right
thing to do. What bothers me primarily is that the internal logic is
identical between the RDMA controller and this one. If you wanna try
factoring them out into library, great. If not, I don't think it should
block merging this controller. We can get to refactoring later.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Patch v3 0/2] cgroup: KVM: New Encryption IDs cgroup controller
  2021-01-05 15:36           ` Tejun Heo
@ 2021-01-06 18:45             ` Vipin Sharma
  0 siblings, 0 replies; 13+ messages in thread
From: Vipin Sharma @ 2021-01-06 18:45 UTC (permalink / raw)
  To: Tejun Heo
  Cc: David Rientjes, Christian Borntraeger, Tom Lendacky, Brijesh,
	Jon, Eric, pbonzini, Sean Christopherson, lizefan, hannes,
	Janosch Frank, corbet, joro, vkuznets, wanpengli, Jim Mattson,
	tglx, mingo, bp, hpa, Matt Gingell, Dionna Glaze, kvm, x86,
	cgroups, linux-doc, linux-kernel

On Tue, Jan 05, 2021 at 10:36:40AM -0500, Tejun Heo wrote:
> Happy new year!
> 
> On Wed, Dec 16, 2020 at 12:02:37PM -0800, Vipin Sharma wrote:
> > I like the idea of having a separate controller to keep the code simple and
> > easier for maintenance.
> 
> Yeah, the more I think about it, keeping it separate seems like the right
> thing to do. What bothers me primarily is that the internal logic is
> identical between the RDMA controller and this one. If you wanna try
> factoring them out into library, great. If not, I don't think it should
> block merging this controller. We can get to refactoring later.
> 

Happy new year!
Sounds great, I will send out a new patch which will not reject the new
max limit based on the current usage. It will not include refactoring
out common code between RDMA and Encryption ID controller. We can pursue
that later.

Thanks
Vipin

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2021-01-06 18:46 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-09 20:54 [Patch v3 0/2] cgroup: KVM: New Encryption IDs cgroup controller Vipin Sharma
2020-12-09 20:54 ` [Patch v3 1/2] cgroup: svm: Add Encryption ID controller Vipin Sharma
2020-12-09 20:54 ` [Patch v3 2/2] cgroup: svm: Encryption IDs cgroup documentation Vipin Sharma
2020-12-09 20:58 ` [Patch v3 0/2] cgroup: KVM: New Encryption IDs cgroup controller Tejun Heo
2020-12-09 20:58   ` Tejun Heo
2020-12-10 14:54   ` Christian Borntraeger
2020-12-10 14:54     ` Christian Borntraeger
2020-12-10 23:44     ` David Rientjes
2020-12-16 15:27       ` Tejun Heo
2020-12-16 20:02         ` Vipin Sharma
2020-12-16 20:02           ` Vipin Sharma
2021-01-05 15:36           ` Tejun Heo
2021-01-06 18:45             ` Vipin Sharma

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.