* [RFC 0/1] Introduce new attribute "priority" to control group
@ 2021-04-04 14:51 yulei.kernel-Re5JQEeQqe8AvxtiuMwx3w
       [not found] ` <cover.1617355387.git.yuleixzhang-1Nz4purKYjRBDgjK7y7TUQ@public.gmane.org>
  0 siblings, 1 reply; 6+ messages in thread
From: yulei.kernel-Re5JQEeQqe8AvxtiuMwx3w @ 2021-04-04 14:51 UTC (permalink / raw)
  To: tj-DgEjT+Ai2ygdnm+yROfE0A, lizefan.x-EC8Uxl6Npydl57MIdRCFDg,
	hannes-druUgvl0LCNAfugRpC6u6w, christian-STijNZzMWpgWenYVfaLwtA
  Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, benbjiang-1Nz4purKYjRBDgjK7y7TUQ,
	kernellwp-Re5JQEeQqe8AvxtiuMwx3w,
	lihaiwei.kernel-Re5JQEeQqe8AvxtiuMwx3w,
	linussli-1Nz4purKYjRBDgjK7y7TUQ,
	herberthbli-1Nz4purKYjRBDgjK7y7TUQ,
	lennychen-1Nz4purKYjRBDgjK7y7TUQ,
	allanyuliu-1Nz4purKYjRBDgjK7y7TUQ, Yulei Zhang

From: Yulei Zhang <yuleixzhang-1Nz4purKYjRBDgjK7y7TUQ@public.gmane.org>

This is the initial patch of a series in which we want to present the idea
of prioritized task management. As cloud computing introduces intricate
configurations to provide customized infrastructures and friendly user
experiences, and in order to maximize resource utilization and improve the
efficiency of arrangement, we add a new attribute, "priority", to the control
group, which subsystems could use as a grading factor to adjust the
behavior of processes.

Based on the order of priority, we could apply different resource configuration
strategies; sometimes this will be more accurate than fine-tuning each
subsystem. It also lets us set fundamental rules: for example, high-priority
cgroups could always seize resources from cgroups with lower priority.

The default value of "priority" is 0, which means the highest priority, and
the total number of priority levels is defined by CGROUP_PRIORITY_MAX. Each
subsystem could register a callback to receive priority change notifications
for its own purposes.

We would like to send out the corresponding features, which rely on the
priority settings, in the coming weeks: for example, prioritized OOM
handling, memory reclaim, and CPU scheduling strategies.

Lei Chen (1):
  cgroup: add support for cgroup priority

 include/linux/cgroup-defs.h |  2 +
 include/linux/cgroup.h      |  2 +
 kernel/cgroup/cgroup.c      | 90 +++++++++++++++++++++++++++++++++++++
 3 files changed, 94 insertions(+)

-- 
2.28.0
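
Editor's sketch (not part of the posted series): the notification scheme the
cover letter describes can be modelled in plain userspace C. This is a minimal
sketch under stated assumptions. The names group_model, subsys_model,
prio_change_fn, MAX_SUBSYS and record_change are hypothetical and exist only
for illustration; only the value 8 for the number of levels and the
"0 is highest" convention come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PRIORITY_MAX 8   /* mirrors CGROUP_PRIORITY_MAX in the patch */
#define MAX_SUBSYS   4   /* hypothetical cap, only for this model */

/* Hypothetical callback type: receives the old and the new priority. */
typedef int (*prio_change_fn)(void *ctx, uint16_t old_prio, uint16_t new_prio);

struct subsys_model {
	prio_change_fn on_change;   /* NULL means "not interested" */
	void *ctx;                  /* opaque data handed back to the callback */
};

struct group_model {
	uint16_t priority;          /* 0 is the highest priority */
	struct subsys_model subsys[MAX_SUBSYS];
	size_t nr_subsys;
};

/* Store the new value, then notify every subsystem that set a callback. */
static void group_set_priority(struct group_model *g, uint16_t prio)
{
	uint16_t old = g->priority;
	size_t i;

	assert(prio < PRIORITY_MAX);
	g->priority = prio;
	for (i = 0; i < g->nr_subsys; i++) {
		if (g->subsys[i].on_change)
			g->subsys[i].on_change(g->subsys[i].ctx, old, prio);
	}
}

/* Example callback: remember the last transition it was told about. */
struct last_change {
	uint16_t old_prio;
	uint16_t new_prio;
	int calls;
};

static int record_change(void *ctx, uint16_t old_prio, uint16_t new_prio)
{
	struct last_change *lc = ctx;

	lc->old_prio = old_prio;
	lc->new_prio = new_prio;
	lc->calls++;
	return 0;
}
```

A caller would fill in subsys[] with the callbacks it cares about and then
call group_set_priority(); subsystems that leave on_change as NULL are simply
skipped.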



* [RFC 1/1] cgroup: add support for cgroup priority
       [not found] ` <cover.1617355387.git.yuleixzhang-1Nz4purKYjRBDgjK7y7TUQ@public.gmane.org>
@ 2021-04-04 14:52   ` yulei.kernel-Re5JQEeQqe8AvxtiuMwx3w
  2021-04-04 17:46     ` kernel test robot
  2021-04-04 17:29   ` [RFC 0/1] Introduce new attribute "priority" to control group Tejun Heo
  1 sibling, 1 reply; 6+ messages in thread
From: yulei.kernel-Re5JQEeQqe8AvxtiuMwx3w @ 2021-04-04 14:52 UTC (permalink / raw)
  To: tj-DgEjT+Ai2ygdnm+yROfE0A, lizefan.x-EC8Uxl6Npydl57MIdRCFDg,
	hannes-druUgvl0LCNAfugRpC6u6w, christian-STijNZzMWpgWenYVfaLwtA
  Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, benbjiang-1Nz4purKYjRBDgjK7y7TUQ,
	kernellwp-Re5JQEeQqe8AvxtiuMwx3w,
	lihaiwei.kernel-Re5JQEeQqe8AvxtiuMwx3w,
	linussli-1Nz4purKYjRBDgjK7y7TUQ,
	herberthbli-1Nz4purKYjRBDgjK7y7TUQ,
	lennychen-1Nz4purKYjRBDgjK7y7TUQ,
	allanyuliu-1Nz4purKYjRBDgjK7y7TUQ, Peng Zhiguang, Yulei Zhang

From: Lei Chen <lennychen-1Nz4purKYjRBDgjK7y7TUQ@public.gmane.org>

Introduce a new attribute, "priority", to the control group, which
subsystems could use as a scale to adjust the behavior of
processes.
The default value of "priority" is 0, which means the highest
priority, and the total number of priority levels is defined
by CGROUP_PRIORITY_MAX.

Signed-off-by: Lei Chen <lennychen-1Nz4purKYjRBDgjK7y7TUQ@public.gmane.org>
Signed-off-by: Liu Yu <allanyuliu-1Nz4purKYjRBDgjK7y7TUQ@public.gmane.org>
Signed-off-by: Peng Zhiguang <zgpeng-1Nz4purKYjRBDgjK7y7TUQ@public.gmane.org>
Signed-off-by: Yulei Zhang <yuleixzhang-1Nz4purKYjRBDgjK7y7TUQ@public.gmane.org>
---
 include/linux/cgroup-defs.h |  2 +
 include/linux/cgroup.h      |  2 +
 kernel/cgroup/cgroup.c      | 90 +++++++++++++++++++++++++++++++++++++
 3 files changed, 94 insertions(+)

diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index 559ee05f8..3fa2f28a9 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -417,6 +417,7 @@ struct cgroup {
 	u16 subtree_ss_mask;
 	u16 old_subtree_control;
 	u16 old_subtree_ss_mask;
+	u16 priority;
 
 	/* Private pointers for each registered subsystem */
 	struct cgroup_subsys_state __rcu *subsys[CGROUP_SUBSYS_COUNT];
@@ -640,6 +641,7 @@ struct cgroup_subsys {
 	void (*exit)(struct task_struct *task);
 	void (*release)(struct task_struct *task);
 	void (*bind)(struct cgroup_subsys_state *root_css);
+	int (*css_priority_change)(struct cgroup_subsys_state *css, u16 old, u16 new);
 
 	bool early_init:1;
 
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 4f2f79de0..734d51aba 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -47,6 +47,7 @@ struct kernel_clone_args;
 
 /* internal flags */
 #define CSS_TASK_ITER_SKIPPED		(1U << 16)
+#define CGROUP_PRIORITY_MAX		8
 
 /* a css_task_iter should be treated as an opaque object */
 struct css_task_iter {
@@ -957,5 +958,6 @@ static inline void cgroup_bpf_get(struct cgroup *cgrp) {}
 static inline void cgroup_bpf_put(struct cgroup *cgrp) {}
 
 #endif /* CONFIG_CGROUP_BPF */
+ssize_t cgroup_priority(struct cgroup_subsys_state *css);
 
 #endif /* _LINUX_CGROUP_H */
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 9153b20e5..dcb057e42 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -1892,6 +1892,7 @@ static void init_cgroup_housekeeping(struct cgroup *cgrp)
 	cgrp->dom_cgrp = cgrp;
 	cgrp->max_descendants = INT_MAX;
 	cgrp->max_depth = INT_MAX;
+	cgrp->priority = 0;
 	INIT_LIST_HEAD(&cgrp->rstat_css_list);
 	prev_cputime_init(&cgrp->prev_cputime);
 
@@ -4783,6 +4784,88 @@ static ssize_t cgroup_threads_write(struct kernfs_open_file *of,
 	return __cgroup_procs_write(of, buf, false) ?: nbytes;
 }
 
+static int cgroup_priority_show(struct seq_file *seq, void *v)
+{
+	struct cgroup *cgrp = seq_css(seq)->cgroup;
+	u16 prio = cgrp->priority;
+
+	seq_printf(seq, "%d\n", prio);
+
+	return 0;
+}
+
+static void cgroup_set_priority(struct cgroup *cgrp, unsigned int priority)
+{
+	u16 old = cgrp->priority;
+	struct cgroup_subsys_state *css;
+	int ssid;
+
+	cgrp->priority = priority;
+	for_each_css(css, ssid, cgrp) {
+		if (css->ss->css_priority_change)
+			css->ss->css_priority_change(css, old, priority);
+	}
+}
+
+static void cgroup_priority_propagate(struct cgroup *cgrp)
+{
+	struct cgroup *dsct;
+	struct cgroup_subsys_state *d_css;
+	u16 priority = cgrp->priority;
+
+	lockdep_assert_held(&cgroup_mutex);
+	cgroup_for_each_live_descendant_pre(dsct, d_css, cgrp) {
+		if (dsct->priority < priority)
+			cgroup_set_priority(dsct, priority);
+	}
+}
+
+static ssize_t cgroup_priority_write(struct kernfs_open_file *of,
+				      char *buf, size_t nbytes, loff_t off)
+{
+	struct cgroup *cgrp, *parent;
+	ssize_t ret;
+	u16 prio, orig;
+
+	buf = strstrip(buf);
+	ret = kstrtoint(buf, 0, &prio);
+	if (ret)
+		return ret;
+
+	if (prio < 0 || prio >= CGROUP_PRIORITY_MAX)
+		return -ERANGE;
+
+	cgrp = cgroup_kn_lock_live(of->kn, false);
+	if (!cgrp)
+		return -ENOENT;
+	parent = cgroup_parent(cgrp);
+	if (parent && prio < parent->priority) {
+		ret = -EINVAL;
+		goto unlock_out;
+	}
+	orig = cgrp->priority;
+	if (prio == orig)
+		goto unlock_out;
+
+	cgroup_set_priority(cgrp, prio);
+	cgroup_priority_propagate(cgrp);
+unlock_out:
+	cgroup_kn_unlock(of->kn);
+
+	return ret ?: nbytes;
+}
+
+ssize_t cgroup_priority(struct cgroup_subsys_state *css)
+{
+	struct cgroup *cgrp = css->cgroup;
+	unsigned int prio = 0;
+
+	if (cgrp)
+		prio = cgrp->priority;
+	return prio;
+}
+EXPORT_SYMBOL(cgroup_priority);
+
 /* cgroup core interface files for the default hierarchy */
 static struct cftype cgroup_base_files[] = {
 	{
@@ -4836,6 +4919,12 @@ static struct cftype cgroup_base_files[] = {
 		.seq_show = cgroup_max_depth_show,
 		.write = cgroup_max_depth_write,
 	},
+	{
+		.name = "cgroup.priority",
+		.flags = CFTYPE_NOT_ON_ROOT,
+		.seq_show = cgroup_priority_show,
+		.write = cgroup_priority_write,
+	},
 	{
 		.name = "cgroup.stat",
 		.seq_show = cgroup_stat_show,
@@ -5178,6 +5267,7 @@ static struct cgroup *cgroup_create(struct cgroup *parent, const char *name,
 	cgrp->self.parent = &parent->self;
 	cgrp->root = root;
 	cgrp->level = level;
+	cgrp->priority = parent->priority;
 
 	ret = psi_cgroup_alloc(cgrp);
 	if (ret)
-- 
2.28.0
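
Editor's sketch (not part of the posted patch): the hierarchy rules the diff
implements can be reproduced in a small userspace model. A child inherits its
parent's value at creation, a write is rejected when it would make a group
numerically lower (that is, higher priority) than its parent, and a change
only ever raises the values of descendants. The names node, add_child,
raise_descendants, node_write_priority and MAX_CHILDREN are hypothetical; the
error codes follow the ones used in cgroup_priority_write().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define PRIORITY_MAX 8   /* mirrors CGROUP_PRIORITY_MAX in the patch */
#define MAX_CHILDREN 4   /* hypothetical cap, only for this model */

struct node {
	unsigned int priority;              /* 0 is the highest priority */
	struct node *parent;
	struct node *children[MAX_CHILDREN];
	size_t nr_children;
};

/* A new child starts with its parent's priority, as in cgroup_create(). */
static void add_child(struct node *parent, struct node *child)
{
	assert(parent->nr_children < MAX_CHILDREN);
	child->parent = parent;
	child->priority = parent->priority;
	parent->children[parent->nr_children++] = child;
}

/* Pre-order walk: raise any descendant whose value is below the floor. */
static void raise_descendants(struct node *n, unsigned int floor)
{
	size_t i;

	for (i = 0; i < n->nr_children; i++) {
		struct node *c = n->children[i];

		if (c->priority < floor)
			c->priority = floor;
		raise_descendants(c, floor);
	}
}

/* Returns 0 on success, -ERANGE or -EINVAL on a rejected write. */
static int node_write_priority(struct node *n, long requested)
{
	if (requested < 0 || requested >= PRIORITY_MAX)
		return -ERANGE;
	if (n->parent && (unsigned int)requested < n->parent->priority)
		return -EINVAL;

	n->priority = (unsigned int)requested;
	raise_descendants(n, n->priority);
	return 0;
}
```

Note one consequence visible in the model: moving a group back toward 0 does
not move its descendants back with it, because propagation only raises values.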



* Re: [RFC 0/1] Introduce new attribute "priority" to control group
       [not found] ` <cover.1617355387.git.yuleixzhang-1Nz4purKYjRBDgjK7y7TUQ@public.gmane.org>
  2021-04-04 14:52   ` [RFC 1/1] cgroup: add support for cgroup priority yulei.kernel-Re5JQEeQqe8AvxtiuMwx3w
@ 2021-04-04 17:29   ` Tejun Heo
  2021-04-08 16:49     ` yulei zhang
  1 sibling, 1 reply; 6+ messages in thread
From: Tejun Heo @ 2021-04-04 17:29 UTC (permalink / raw)
  To: yulei.kernel-Re5JQEeQqe8AvxtiuMwx3w
  Cc: lizefan.x-EC8Uxl6Npydl57MIdRCFDg, hannes-druUgvl0LCNAfugRpC6u6w,
	christian-STijNZzMWpgWenYVfaLwtA, cgroups-u79uwXL29TY76Z2rM5mHXA,
	benbjiang-1Nz4purKYjRBDgjK7y7TUQ,
	kernellwp-Re5JQEeQqe8AvxtiuMwx3w,
	lihaiwei.kernel-Re5JQEeQqe8AvxtiuMwx3w,
	linussli-1Nz4purKYjRBDgjK7y7TUQ,
	herberthbli-1Nz4purKYjRBDgjK7y7TUQ,
	lennychen-1Nz4purKYjRBDgjK7y7TUQ,
	allanyuliu-1Nz4purKYjRBDgjK7y7TUQ, Yulei Zhang

On Sun, Apr 04, 2021 at 10:51:53PM +0800, yulei.kernel-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org wrote:
> From: Yulei Zhang <yuleixzhang-1Nz4purKYjRBDgjK7y7TUQ@public.gmane.org>
> 
> This is the initial patch of a series in which we want to present the idea
> of prioritized task management. As cloud computing introduces intricate
> configurations to provide customized infrastructures and friendly user
> experiences, and in order to maximize resource utilization and improve the
> efficiency of arrangement, we add a new attribute, "priority", to the control
> group, which subsystems could use as a grading factor to adjust the
> behavior of processes.
> 
> Based on the order of priority, we could apply different resource configuration
> strategies; sometimes this will be more accurate than fine-tuning each
> subsystem. It also lets us set fundamental rules: for example, high-priority
> cgroups could always seize resources from cgroups with lower priority.
> 
> The default value of "priority" is 0, which means the highest priority, and
> the total number of priority levels is defined by CGROUP_PRIORITY_MAX. Each
> subsystem could register a callback to receive priority change notifications
> for its own purposes.
> 
> We would like to send out the corresponding features, which rely on the
> priority settings, in the coming weeks: for example, prioritized OOM
> handling, memory reclaim, and CPU scheduling strategies.

We've been trying really hard to give each interface semantics which is
logical and describable independent of the implementation details. This runs
precisely against that.

Thanks.

-- 
tejun


* Re: [RFC 1/1] cgroup: add support for cgroup priority
  2021-04-04 14:52   ` [RFC 1/1] cgroup: add support for cgroup priority yulei.kernel-Re5JQEeQqe8AvxtiuMwx3w
@ 2021-04-04 17:46     ` kernel test robot
  0 siblings, 0 replies; 6+ messages in thread
From: kernel test robot @ 2021-04-04 17:46 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 10386 bytes --]

Hi,

[FYI, it's a private test report for your RFC patch.]
[auto build test ERROR on cgroup/for-next]
[also build test ERROR on v5.12-rc5 next-20210401]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/yulei-kernel-gmail-com/Introduce-new-attribute-priority-to-control-group/20210404-225313
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git for-next
config: s390-randconfig-r025-20210404 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 30df6d5d6a8537d3ec7d8fe4299289a4c5a74d5c)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install s390 cross compiling tool for clang build
        # apt-get install binutils-s390x-linux-gnu
        # https://github.com/0day-ci/linux/commit/8000447113bfea873921c098f0033aba76d206ae
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review yulei-kernel-gmail-com/Introduce-new-attribute-priority-to-control-group/20210404-225313
        git checkout 8000447113bfea873921c098f0033aba76d206ae
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=s390 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   include/asm-generic/io.h:490:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:34:59: note: expanded from macro '__le32_to_cpu'
   #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
                                                             ^
   include/uapi/linux/swab.h:119:21: note: expanded from macro '__swab32'
           ___constant_swab32(x) :                 \
                              ^
   include/uapi/linux/swab.h:20:12: note: expanded from macro '___constant_swab32'
           (((__u32)(x) & (__u32)0x0000ff00UL) <<  8) |            \
                     ^
   In file included from kernel/cgroup/cgroup.c:35:
   In file included from include/linux/init_task.h:18:
   In file included from include/net/net_namespace.h:39:
   In file included from include/linux/skbuff.h:31:
   In file included from include/linux/dma-mapping.h:10:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:80:
   include/asm-generic/io.h:490:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:34:59: note: expanded from macro '__le32_to_cpu'
   #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
                                                             ^
   include/uapi/linux/swab.h:119:21: note: expanded from macro '__swab32'
           ___constant_swab32(x) :                 \
                              ^
   include/uapi/linux/swab.h:21:12: note: expanded from macro '___constant_swab32'
           (((__u32)(x) & (__u32)0x00ff0000UL) >>  8) |            \
                     ^
   In file included from kernel/cgroup/cgroup.c:35:
   In file included from include/linux/init_task.h:18:
   In file included from include/net/net_namespace.h:39:
   In file included from include/linux/skbuff.h:31:
   In file included from include/linux/dma-mapping.h:10:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:80:
   include/asm-generic/io.h:490:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:34:59: note: expanded from macro '__le32_to_cpu'
   #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
                                                             ^
   include/uapi/linux/swab.h:119:21: note: expanded from macro '__swab32'
           ___constant_swab32(x) :                 \
                              ^
   include/uapi/linux/swab.h:22:12: note: expanded from macro '___constant_swab32'
           (((__u32)(x) & (__u32)0xff000000UL) >> 24)))
                     ^
   In file included from kernel/cgroup/cgroup.c:35:
   In file included from include/linux/init_task.h:18:
   In file included from include/net/net_namespace.h:39:
   In file included from include/linux/skbuff.h:31:
   In file included from include/linux/dma-mapping.h:10:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:80:
   include/asm-generic/io.h:490:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:34:59: note: expanded from macro '__le32_to_cpu'
   #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
                                                             ^
   include/uapi/linux/swab.h:120:12: note: expanded from macro '__swab32'
           __fswab32(x))
                     ^
   In file included from kernel/cgroup/cgroup.c:35:
   In file included from include/linux/init_task.h:18:
   In file included from include/net/net_namespace.h:39:
   In file included from include/linux/skbuff.h:31:
   In file included from include/linux/dma-mapping.h:10:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:80:
   include/asm-generic/io.h:501:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writeb(value, PCI_IOBASE + addr);
                               ~~~~~~~~~~ ^
   include/asm-generic/io.h:511:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
   include/asm-generic/io.h:521:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
   include/asm-generic/io.h:609:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           readsb(PCI_IOBASE + addr, buffer, count);
                  ~~~~~~~~~~ ^
   include/asm-generic/io.h:617:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           readsw(PCI_IOBASE + addr, buffer, count);
                  ~~~~~~~~~~ ^
   include/asm-generic/io.h:625:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           readsl(PCI_IOBASE + addr, buffer, count);
                  ~~~~~~~~~~ ^
   include/asm-generic/io.h:634:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           writesb(PCI_IOBASE + addr, buffer, count);
                   ~~~~~~~~~~ ^
   include/asm-generic/io.h:643:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           writesw(PCI_IOBASE + addr, buffer, count);
                   ~~~~~~~~~~ ^
   include/asm-generic/io.h:652:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           writesl(PCI_IOBASE + addr, buffer, count);
                   ~~~~~~~~~~ ^
>> kernel/cgroup/cgroup.c:4831:26: error: incompatible pointer types passing 'u16 *' (aka 'unsigned short *') to parameter of type 'int *' [-Werror,-Wincompatible-pointer-types]
           ret = kstrtoint(buf, 0, &prio);
                                   ^~~~~
   include/linux/kernel.h:245:67: note: passing argument to parameter 'res' here
   int __must_check kstrtoint(const char *s, unsigned int base, int *res);
                                                                     ^
   20 warnings and 1 error generated.


vim +4831 kernel/cgroup/cgroup.c

  4822	
  4823	static ssize_t cgroup_priority_write(struct kernfs_open_file *of,
  4824					      char *buf, size_t nbytes, loff_t off)
  4825	{
  4826		struct cgroup *cgrp, *parent;
  4827		ssize_t ret;
  4828		u16 prio, orig;
  4829	
  4830		buf = strstrip(buf);
> 4831		ret = kstrtoint(buf, 0, &prio);
  4832		if (ret)
  4833			return ret;
  4834	
  4835		if (prio < 0 || prio >= CGROUP_PRIORITY_MAX)
  4836			return -ERANGE;
  4837	
  4838		cgrp = cgroup_kn_lock_live(of->kn, false);
  4839		if (!cgrp)
  4840			return -ENOENT;
  4841		parent = cgroup_parent(cgrp);
  4842		if (parent && prio < parent->priority) {
  4843			ret = -EINVAL;
  4844			goto unlock_out;
  4845		}
  4846		orig = cgrp->priority;
  4847		if (prio == orig)
  4848			goto unlock_out;
  4849	
  4850		cgroup_set_priority(cgrp, prio);
  4851		cgroup_priority_propagate(cgrp);
  4852	unlock_out:
  4853		cgroup_kn_unlock(of->kn);
  4854	
  4855		return ret ?: nbytes;
  4856	}
  4857	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org
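
Editor's sketch (not part of the robot's report): the error above comes from
handing a 'u16 *' to a parser that writes through an 'int *'. One way to avoid
it is to parse into a type wide enough for any input, range-check, and only
then narrow. The userspace sketch below shows that pattern with strtol();
parse_priority is a hypothetical name. In the kernel itself the helper family
also has a u16 variant, kstrtou16(), which would match the declared type of
'prio' directly. With an unsigned type the 'prio < 0' test in the patch can
never be true, so the negative check has to happen on the wide value, as here.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define PRIORITY_MAX 8   /* mirrors CGROUP_PRIORITY_MAX in the patch */

/*
 * Parse a priority from text. Returns 0 and stores the value on success,
 * -EINVAL for malformed input, -ERANGE for values outside [0, PRIORITY_MAX).
 * *out is left untouched on failure.
 */
static int parse_priority(const char *buf, uint16_t *out)
{
	char *end;
	long val;

	assert(buf && out);

	errno = 0;
	val = strtol(buf, &end, 0);       /* base 0: accepts 7, 0x7, 07 */
	if (end == buf || *end != '\0')
		return -EINVAL;           /* empty string or trailing junk */
	if (errno == ERANGE || val < 0 || val >= PRIORITY_MAX)
		return -ERANGE;           /* checked while still a long */

	*out = (uint16_t)val;             /* safe: 0 <= val < PRIORITY_MAX */
	return 0;
}
```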

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 23787 bytes --]


* Re: [RFC 0/1] Introduce new attribute "priority" to control group
  2021-04-04 17:29   ` [RFC 0/1] Introduce new attribute "priority" to control group Tejun Heo
@ 2021-04-08 16:49     ` yulei zhang
       [not found]       ` <CACZOiM0xA+6kAeM2sk3SfVV9Vu+5dOzC7APoNmB0Zw3jQKbg+w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 6+ messages in thread
From: yulei zhang @ 2021-04-08 16:49 UTC (permalink / raw)
  To: Tejun Heo
  Cc: lizefan.x-EC8Uxl6Npydl57MIdRCFDg, hannes-druUgvl0LCNAfugRpC6u6w,
	christian-STijNZzMWpgWenYVfaLwtA, cgroups-u79uwXL29TY76Z2rM5mHXA,
	benbjiang-1Nz4purKYjRBDgjK7y7TUQ, Wanpeng Li, Haiwei Li,
	linussli-1Nz4purKYjRBDgjK7y7TUQ,
	herberthbli-1Nz4purKYjRBDgjK7y7TUQ,
	lennychen-1Nz4purKYjRBDgjK7y7TUQ,
	allanyuliu-1Nz4purKYjRBDgjK7y7TUQ, Yulei Zhang

On Mon, Apr 5, 2021 at 1:29 AM Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
>
> On Sun, Apr 04, 2021 at 10:51:53PM +0800, yulei.kernel-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org wrote:
> > From: Yulei Zhang <yuleixzhang-1Nz4purKYjRBDgjK7y7TUQ@public.gmane.org>
> >
> > This is the initial patch of a series in which we want to present the idea
> > of prioritized task management. As cloud computing introduces intricate
> > configurations to provide customized infrastructures and friendly user
> > experiences, and in order to maximize resource utilization and improve the
> > efficiency of arrangement, we add a new attribute, "priority", to the control
> > group, which subsystems could use as a grading factor to adjust the
> > behavior of processes.
> >
> > Based on the order of priority, we could apply different resource configuration
> > strategies; sometimes this will be more accurate than fine-tuning each
> > subsystem. It also lets us set fundamental rules: for example, high-priority
> > cgroups could always seize resources from cgroups with lower priority.
> >
> > The default value of "priority" is 0, which means the highest priority, and
> > the total number of priority levels is defined by CGROUP_PRIORITY_MAX. Each
> > subsystem could register a callback to receive priority change notifications
> > for its own purposes.
> >
> > We would like to send out the corresponding features, which rely on the
> > priority settings, in the coming weeks: for example, prioritized OOM
> > handling, memory reclaim, and CPU scheduling strategies.
>
> We've been trying really hard to give each interface semantics which is
> logical and describable independent of the implementation details. This runs
> precisely against that.
>
> Thanks.
>
> --
> tejun

Thanks for the feedback. I am afraid that I didn't express myself clearly
about the idea of the priority attribute. We don't want to overwrite the
semantics of each interface in cgroup; we just hope to introduce another
factor that could help us apply the management strategy. For example, in our
production environment, K8s has its own priority class to implement QoS, and
it would be very helpful if the control group could provide a corresponding
priority to assist the implementation.
-Yulei


* Re: [RFC 0/1] Introduce new attribute "priority" to control group
       [not found]       ` <CACZOiM0xA+6kAeM2sk3SfVV9Vu+5dOzC7APoNmB0Zw3jQKbg+w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2021-04-08 20:58         ` Shakeel Butt
  0 siblings, 0 replies; 6+ messages in thread
From: Shakeel Butt @ 2021-04-08 20:58 UTC (permalink / raw)
  To: yulei zhang
  Cc: Tejun Heo, Zefan Li, Johannes Weiner, Christian Brauner, Cgroups,
	benbjiang-1Nz4purKYjRBDgjK7y7TUQ, Wanpeng Li, Haiwei Li,
	linussli-1Nz4purKYjRBDgjK7y7TUQ,
	herberthbli-1Nz4purKYjRBDgjK7y7TUQ,
	lennychen-1Nz4purKYjRBDgjK7y7TUQ,
	allanyuliu-1Nz4purKYjRBDgjK7y7TUQ, Yulei Zhang

On Thu, Apr 8, 2021 at 9:51 AM yulei zhang <yulei.kernel-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>
> On Mon, Apr 5, 2021 at 1:29 AM Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> >
> > On Sun, Apr 04, 2021 at 10:51:53PM +0800, yulei.kernel-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org wrote:
> > > From: Yulei Zhang <yuleixzhang-1Nz4purKYjRBDgjK7y7TUQ@public.gmane.org>
> > >
> > > This is the initial patch of a series in which we want to present the idea
> > > of prioritized task management. As cloud computing introduces intricate
> > > configurations to provide customized infrastructures and friendly user
> > > experiences, and in order to maximize resource utilization and improve the
> > > efficiency of arrangement, we add a new attribute, "priority", to the control
> > > group, which subsystems could use as a grading factor to adjust the
> > > behavior of processes.
> > >
> > > Based on the order of priority, we could apply different resource configuration
> > > strategies; sometimes this will be more accurate than fine-tuning each
> > > subsystem. It also lets us set fundamental rules: for example, high-priority
> > > cgroups could always seize resources from cgroups with lower priority.
> > >
> > > The default value of "priority" is 0, which means the highest priority, and
> > > the total number of priority levels is defined by CGROUP_PRIORITY_MAX. Each
> > > subsystem could register a callback to receive priority change notifications
> > > for its own purposes.
> > >
> > > We would like to send out the corresponding features, which rely on the
> > > priority settings, in the coming weeks: for example, prioritized OOM
> > > handling, memory reclaim, and CPU scheduling strategies.
> >
> > We've been trying really hard to give each interface semantics which is
> > logical and describable independent of the implementation details. This runs
> > precisely against that.
> >
> > Thanks.
> >
> > --
> > tejun
>
> Thanks for the feedback. I am afraid that I didn't express myself clearly
> about the idea of the priority attribute. We don't want to overwrite the
> semantics of each interface in cgroup; we just hope to introduce another
> factor that could help us apply the management strategy. For example, in our
> production environment, K8s has its own priority class to implement QoS, and
> it would be very helpful if the control group could provide a corresponding
> priority to assist the implementation.

IMHO 'priority' is a high-level concept, mainly for defining policies,
and should not be part of the user API. The meaning of priority changes
with each use case, and it can have more than one dimension. For example,
I can have two jobs running on a machine: one is a batch job and the
second is a latency-sensitive job, say a web server. For the oom-kill
use case, I might prefer to kill the web server, as there will be
multiple instances running and the load balancer will redirect the
load. For memory reclaim, however, I would prefer to reclaim from the batch
job, as the cost of refaults in the web server is visible to end customers.

Basically, we should have mechanisms in the kernel that can be used to
define and enforce high-level priorities. I can set memory.low on the
web server to prefer reclaim from the batch job. For oom-kill, Android's
lmkd and fb's oomd are examples of user space OOM killers where the
policies live in user space.
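
Editor's sketch (not part of the original message): the memory.low mechanism
mentioned above is driven by writing a byte count into the cgroup's
memory.low file. The helper below is a minimal userspace illustration;
set_memory_low is a hypothetical name, and the directory passed in (for
example a web server's cgroup under a cgroup v2 mount) is an assumption that
depends on the local setup.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write a byte count to <cgroup_dir>/memory.low.
 * Returns 0 on success and -1 on any failure (path too long, open or
 * write error). The caller chooses the cgroup directory.
 */
static int set_memory_low(const char *cgroup_dir, unsigned long long bytes)
{
	char path[4096];
	FILE *f;
	int n;
	int ok;

	assert(cgroup_dir);

	n = snprintf(path, sizeof(path), "%s/memory.low", cgroup_dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	ok = fprintf(f, "%llu\n", bytes) > 0;
	if (fclose(f) != 0)      /* write errors can surface at close time */
		ok = 0;

	return ok ? 0 : -1;
}
```

Giving the web server's group a non-zero memory.low and leaving the batch
job's at its default is one way to express "reclaim from batch first" without
any kernel-level notion of priority; which job to kill on OOM remains a
separate, user space decision.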

