All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/2] psi: add support for multi level pressure stall trigger
@ 2022-05-16  3:35 Chen Wandun
  2022-05-16  3:35 ` [PATCH 2/2] psi: add description about multi level pressure trigger Chen Wandun
  2022-05-16  6:20 ` [PATCH 1/2] psi: add support for multi level pressure stall trigger Alex Shi
  0 siblings, 2 replies; 17+ messages in thread
From: Chen Wandun @ 2022-05-16  3:35 UTC (permalink / raw)
  To: linux-kernel, hannes, surenb, alexs, corbet, linux-doc

Nowadays, psi events are triggered when stall time exceed
stall threshold, but no any different between these events.

Actually, events can be divide into multi level, each level
represent a different stall pressure, that is help to identify
pressure information more accurately.

echo "some 150000 350000 1000000" > /proc/pressure/memory would
add [150ms, 350ms) threshold for partial memory stall measured
within 1sec time window.

Signed-off-by: Chen Wandun <chenwandun@huawei.com>
---
 include/linux/psi_types.h |  3 ++-
 kernel/sched/psi.c        | 19 +++++++++++++------
 2 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/include/linux/psi_types.h b/include/linux/psi_types.h
index c7fe7c089718..2b1393c8bf90 100644
--- a/include/linux/psi_types.h
+++ b/include/linux/psi_types.h
@@ -119,7 +119,8 @@ struct psi_trigger {
 	enum psi_states state;
 
 	/* User-spacified threshold in ns */
-	u64 threshold;
+	u64 min_threshold;
+	u64 max_threshold;
 
 	/* List node inside triggers list */
 	struct list_head node;
diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 6f9533c95b0a..17dd233b533a 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -541,7 +541,7 @@ static u64 update_triggers(struct psi_group *group, u64 now)
 
 			/* Calculate growth since last update */
 			growth = window_update(&t->win, now, total[t->state]);
-			if (growth < t->threshold)
+			if (growth < t->min_threshold || growth >= t->max_threshold)
 				continue;
 
 			t->pending_event = true;
@@ -1087,15 +1087,18 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
 {
 	struct psi_trigger *t;
 	enum psi_states state;
-	u32 threshold_us;
+	u32 min_threshold_us;
+	u32 max_threshold_us;
 	u32 window_us;
 
 	if (static_branch_likely(&psi_disabled))
 		return ERR_PTR(-EOPNOTSUPP);
 
-	if (sscanf(buf, "some %u %u", &threshold_us, &window_us) == 2)
+	if (sscanf(buf, "some %u %u %u", &min_threshold_us,
+				&max_threshold_us, &window_us) == 3)
 		state = PSI_IO_SOME + res * 2;
-	else if (sscanf(buf, "full %u %u", &threshold_us, &window_us) == 2)
+	else if (sscanf(buf, "full %u %u %u", &min_threshold_us,
+				&max_threshold_us, &window_us) == 3)
 		state = PSI_IO_FULL + res * 2;
 	else
 		return ERR_PTR(-EINVAL);
@@ -1107,8 +1110,11 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
 		window_us > WINDOW_MAX_US)
 		return ERR_PTR(-EINVAL);
 
+	if (min_threshold_us >= max_threshold_us)
+		return ERR_PTR(-EINVAL);
+
 	/* Check threshold */
-	if (threshold_us == 0 || threshold_us > window_us)
+	if (max_threshold_us > window_us)
 		return ERR_PTR(-EINVAL);
 
 	t = kmalloc(sizeof(*t), GFP_KERNEL);
@@ -1117,7 +1123,8 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
 
 	t->group = group;
 	t->state = state;
-	t->threshold = threshold_us * NSEC_PER_USEC;
+	t->min_threshold = min_threshold_us * NSEC_PER_USEC;
+	t->max_threshold = max_threshold_us * NSEC_PER_USEC;
 	t->win.size = window_us * NSEC_PER_USEC;
 	window_reset(&t->win, 0, 0, 0);
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/2] psi: add description about multi level pressure trigger
  2022-05-16  3:35 [PATCH 1/2] psi: add support for multi level pressure stall trigger Chen Wandun
@ 2022-05-16  3:35 ` Chen Wandun
  2022-05-16  6:20 ` [PATCH 1/2] psi: add support for multi level pressure stall trigger Alex Shi
  1 sibling, 0 replies; 17+ messages in thread
From: Chen Wandun @ 2022-05-16  3:35 UTC (permalink / raw)
  To: linux-kernel, hannes, surenb, alexs, corbet, linux-doc

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="y", Size: 3709 bytes --]

add description about multi level pressure tirgger in documentation
about psi.

Signed-off-by: Chen Wandun <chenwandun@huawei.com>
---
 Documentation/accounting/psi.rst                    | 12 ++++++------
 Documentation/translations/zh_CN/accounting/psi.rst | 11 ++++++-----
 2 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/Documentation/accounting/psi.rst b/Documentation/accounting/psi.rst
index 860fe651d645..7abfe4fff201 100644
--- a/Documentation/accounting/psi.rst
+++ b/Documentation/accounting/psi.rst
@@ -81,12 +81,12 @@ desired threshold and time window. The open file descriptor should be
 used to wait for trigger events using select(), poll() or epoll().
 The following format is used::
 
-	<some|full> <stall amount in us> <time window in us>
+<some|full> <min stall amount in us> <max stall amount in us> <time window in us>
 
-For example writing "some 150000 1000000" into /proc/pressure/memory
-would add 150ms threshold for partial memory stall measured within
-1sec time window. Writing "full 50000 1000000" into /proc/pressure/io
-would add 50ms threshold for full io stall measured within 1sec time window.
+For example writing "some 150000 350000 1000000" into /proc/pressure/memory
+would add 150ms ~ 350ms threshold for partial memory stall measured within
+1sec time window. Writing "full 50000 70000 1000000" into /proc/pressure/io
+would add 50ms ~ 70ms threshold for full io stall measured within 1sec time window.
 
 Triggers can be set on more than one psi metric and more than one trigger
 for the same psi metric can be specified. However for each trigger a separate
@@ -132,7 +132,7 @@ Userspace monitor usage example
    * and 150ms threshold.
    */
   int main() {
-	const char trig[] = "some 150000 1000000";
+	const char trig[] = "some 150000 350000 1000000";
 	struct pollfd fds;
 	int n;
 
diff --git a/Documentation/translations/zh_CN/accounting/psi.rst b/Documentation/translations/zh_CN/accounting/psi.rst
index a0ddb7bd257c..9e8f6467e034 100644
--- a/Documentation/translations/zh_CN/accounting/psi.rst
+++ b/Documentation/translations/zh_CN/accounting/psi.rst
@@ -69,11 +69,12 @@ avg代表阻塞时间占比(百分比),为最近10秒、60秒、300秒内
 所打开的文件描述符用于等待事件,可使用select()、poll()、epoll()。
 写入信息的格式如下:
 
-        <some|full> <stall amount in us> <time window in us>
+        <some|full> <min stall amount in us> <max stall amount in us> <time window in us>
 
-示例:向/proc/pressure/memory写入"some 150000 1000000"将新增触发器,将在
-1秒内至少一个任务阻塞于内存的总时间超过150ms时触发。向/proc/pressure/io写入
-"full 50000 1000000"将新增触发器,将在1秒内所有任务都阻塞于io的总时间超过50ms时触发。
+示例:向/proc/pressure/memory写入"some 150000 350000, 1000000"将新增触发器,将在
+1秒内至少一个任务阻塞于内存的总时间在150ms ~ 350ms时触发。向/proc/pressure/io写入
+"full 50000 70000 1000000"将新增触发器,将在1秒内所有任务都阻塞于io的总时间在50ms ~ 70ms
+时触发。
 
 触发器可针对多个psi度量值设置,同一个psi度量值可设置多个触发器。每个触发器需要
 单独的文件描述符用于轮询,以区分于其他触发器。所以即使对于同一个psi接口文件,
@@ -105,7 +106,7 @@ psi接口提供的均值即可。
 
   /* 监控内存部分阻塞,监控时间窗口为1秒、阻塞门限为150毫秒。*/
   int main() {
-        const char trig[] = "some 150000 1000000";
+        const char trig[] = "some 150000 350000 1000000";
         struct pollfd fds;
         int n;
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] psi: add support for multi level pressure stall trigger
  2022-05-16  3:35 [PATCH 1/2] psi: add support for multi level pressure stall trigger Chen Wandun
  2022-05-16  3:35 ` [PATCH 2/2] psi: add description about multi level pressure trigger Chen Wandun
@ 2022-05-16  6:20 ` Alex Shi
  2022-05-16  8:21   ` Suren Baghdasaryan
  2022-05-17  9:08   ` Chen Wandun
  1 sibling, 2 replies; 17+ messages in thread
From: Alex Shi @ 2022-05-16  6:20 UTC (permalink / raw)
  To: Chen Wandun, linux-kernel, hannes, surenb, alexs, corbet, linux-doc



On 5/16/22 11:35, Chen Wandun wrote:
> Nowadays, psi events are triggered when stall time exceed
> stall threshold, but no any different between these events.
> 
> Actually, events can be divide into multi level, each level
> represent a different stall pressure, that is help to identify
> pressure information more accurately.
> 
> echo "some 150000 350000 1000000" > /proc/pressure/memory would

This breaks the old ABI. And why you need this new function?

Thanks

> add [150ms, 350ms) threshold for partial memory stall measured
> within 1sec time window.
> 
> Signed-off-by: Chen Wandun <chenwandun@huawei.com>
> ---
>  include/linux/psi_types.h |  3 ++-
>  kernel/sched/psi.c        | 19 +++++++++++++------
>  2 files changed, 15 insertions(+), 7 deletions(-)
> 
> diff --git a/include/linux/psi_types.h b/include/linux/psi_types.h
> index c7fe7c089718..2b1393c8bf90 100644
> --- a/include/linux/psi_types.h
> +++ b/include/linux/psi_types.h
> @@ -119,7 +119,8 @@ struct psi_trigger {
>  	enum psi_states state;
>  
>  	/* User-spacified threshold in ns */
> -	u64 threshold;
> +	u64 min_threshold;
> +	u64 max_threshold;
>  
>  	/* List node inside triggers list */
>  	struct list_head node;
> diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
> index 6f9533c95b0a..17dd233b533a 100644
> --- a/kernel/sched/psi.c
> +++ b/kernel/sched/psi.c
> @@ -541,7 +541,7 @@ static u64 update_triggers(struct psi_group *group, u64 now)
>  
>  			/* Calculate growth since last update */
>  			growth = window_update(&t->win, now, total[t->state]);
> -			if (growth < t->threshold)
> +			if (growth < t->min_threshold || growth >= t->max_threshold)
>  				continue;
>  
>  			t->pending_event = true;
> @@ -1087,15 +1087,18 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
>  {
>  	struct psi_trigger *t;
>  	enum psi_states state;
> -	u32 threshold_us;
> +	u32 min_threshold_us;
> +	u32 max_threshold_us;
>  	u32 window_us;
>  
>  	if (static_branch_likely(&psi_disabled))
>  		return ERR_PTR(-EOPNOTSUPP);
>  
> -	if (sscanf(buf, "some %u %u", &threshold_us, &window_us) == 2)
> +	if (sscanf(buf, "some %u %u %u", &min_threshold_us,
> +				&max_threshold_us, &window_us) == 3)
>  		state = PSI_IO_SOME + res * 2;
> -	else if (sscanf(buf, "full %u %u", &threshold_us, &window_us) == 2)
> +	else if (sscanf(buf, "full %u %u %u", &min_threshold_us,
> +				&max_threshold_us, &window_us) == 3)
>  		state = PSI_IO_FULL + res * 2;
>  	else
>  		return ERR_PTR(-EINVAL);
> @@ -1107,8 +1110,11 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
>  		window_us > WINDOW_MAX_US)
>  		return ERR_PTR(-EINVAL);
>  
> +	if (min_threshold_us >= max_threshold_us)
> +		return ERR_PTR(-EINVAL);
> +
>  	/* Check threshold */
> -	if (threshold_us == 0 || threshold_us > window_us)
> +	if (max_threshold_us > window_us)
>  		return ERR_PTR(-EINVAL);
>  
>  	t = kmalloc(sizeof(*t), GFP_KERNEL);
> @@ -1117,7 +1123,8 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
>  
>  	t->group = group;
>  	t->state = state;
> -	t->threshold = threshold_us * NSEC_PER_USEC;
> +	t->min_threshold = min_threshold_us * NSEC_PER_USEC;
> +	t->max_threshold = max_threshold_us * NSEC_PER_USEC;
>  	t->win.size = window_us * NSEC_PER_USEC;
>  	window_reset(&t->win, 0, 0, 0);
>  

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] psi: add support for multi level pressure stall trigger
  2022-05-16  6:20 ` [PATCH 1/2] psi: add support for multi level pressure stall trigger Alex Shi
@ 2022-05-16  8:21   ` Suren Baghdasaryan
  2022-05-16  8:43     ` Suren Baghdasaryan
  2022-05-17  9:38     ` Chen Wandun
  2022-05-17  9:08   ` Chen Wandun
  1 sibling, 2 replies; 17+ messages in thread
From: Suren Baghdasaryan @ 2022-05-16  8:21 UTC (permalink / raw)
  To: Alex Shi
  Cc: Chen Wandun, LKML, Johannes Weiner, Alex Shi, Jonathan Corbet,
	open list:DOCUMENTATION

On Sun, May 15, 2022 at 11:20 PM Alex Shi <seakeel@gmail.com> wrote:
>
>
>
> On 5/16/22 11:35, Chen Wandun wrote:
> > Nowadays, psi events are triggered when stall time exceed
> > stall threshold, but no any different between these events.
> >
> > Actually, events can be divide into multi level, each level
> > represent a different stall pressure, that is help to identify
> > pressure information more accurately.

IIUC by defining min and max, you want the trigger to activate when
the stall is between min and max thresholds. But I don't see why you
would need that. If you want to have several levels, you can create
multiple triggers and monitor them separately. For your example, that
would be:

echo "some 150000 1000000" > /proc/pressure/memory
echo "some 350000 1000000" > /proc/pressure/memory

Your first trigger will fire whenever the stall exceeds 150ms within
each 1sec and the second one will trigger when it exceeds 350ms. It is
true that if the stall jumps sharply above 350ms, you would get both
triggers firing. I'm guessing that's why you want this functionality
so that 150ms trigger does not fire when 350ms one is firing but why
is that a problem? Can't userspace pick the highest level one and
ignore all the lower ones when this happens? Or are you addressing
some other requirement?

> >
> > echo "some 150000 350000 1000000" > /proc/pressure/memory would
>
> This breaks the old ABI. And why you need this new function?

Both great points.

>
> Thanks
>
> > add [150ms, 350ms) threshold for partial memory stall measured
> > within 1sec time window.
> >
> > Signed-off-by: Chen Wandun <chenwandun@huawei.com>
> > ---
> >  include/linux/psi_types.h |  3 ++-
> >  kernel/sched/psi.c        | 19 +++++++++++++------
> >  2 files changed, 15 insertions(+), 7 deletions(-)
> >
> > diff --git a/include/linux/psi_types.h b/include/linux/psi_types.h
> > index c7fe7c089718..2b1393c8bf90 100644
> > --- a/include/linux/psi_types.h
> > +++ b/include/linux/psi_types.h
> > @@ -119,7 +119,8 @@ struct psi_trigger {
> >       enum psi_states state;
> >
> >       /* User-spacified threshold in ns */
> > -     u64 threshold;
> > +     u64 min_threshold;
> > +     u64 max_threshold;
> >
> >       /* List node inside triggers list */
> >       struct list_head node;
> > diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
> > index 6f9533c95b0a..17dd233b533a 100644
> > --- a/kernel/sched/psi.c
> > +++ b/kernel/sched/psi.c
> > @@ -541,7 +541,7 @@ static u64 update_triggers(struct psi_group *group, u64 now)
> >
> >                       /* Calculate growth since last update */
> >                       growth = window_update(&t->win, now, total[t->state]);
> > -                     if (growth < t->threshold)
> > +                     if (growth < t->min_threshold || growth >= t->max_threshold)
> >                               continue;
> >
> >                       t->pending_event = true;
> > @@ -1087,15 +1087,18 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
> >  {
> >       struct psi_trigger *t;
> >       enum psi_states state;
> > -     u32 threshold_us;
> > +     u32 min_threshold_us;
> > +     u32 max_threshold_us;
> >       u32 window_us;
> >
> >       if (static_branch_likely(&psi_disabled))
> >               return ERR_PTR(-EOPNOTSUPP);
> >
> > -     if (sscanf(buf, "some %u %u", &threshold_us, &window_us) == 2)
> > +     if (sscanf(buf, "some %u %u %u", &min_threshold_us,
> > +                             &max_threshold_us, &window_us) == 3)
> >               state = PSI_IO_SOME + res * 2;
> > -     else if (sscanf(buf, "full %u %u", &threshold_us, &window_us) == 2)
> > +     else if (sscanf(buf, "full %u %u %u", &min_threshold_us,
> > +                             &max_threshold_us, &window_us) == 3)
> >               state = PSI_IO_FULL + res * 2;
> >       else
> >               return ERR_PTR(-EINVAL);
> > @@ -1107,8 +1110,11 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
> >               window_us > WINDOW_MAX_US)
> >               return ERR_PTR(-EINVAL);
> >
> > +     if (min_threshold_us >= max_threshold_us)
> > +             return ERR_PTR(-EINVAL);
> > +
> >       /* Check threshold */
> > -     if (threshold_us == 0 || threshold_us > window_us)
> > +     if (max_threshold_us > window_us)
> >               return ERR_PTR(-EINVAL);
> >
> >       t = kmalloc(sizeof(*t), GFP_KERNEL);
> > @@ -1117,7 +1123,8 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
> >
> >       t->group = group;
> >       t->state = state;
> > -     t->threshold = threshold_us * NSEC_PER_USEC;
> > +     t->min_threshold = min_threshold_us * NSEC_PER_USEC;
> > +     t->max_threshold = max_threshold_us * NSEC_PER_USEC;
> >       t->win.size = window_us * NSEC_PER_USEC;
> >       window_reset(&t->win, 0, 0, 0);
> >

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] psi: add support for multi level pressure stall trigger
  2022-05-16  8:21   ` Suren Baghdasaryan
@ 2022-05-16  8:43     ` Suren Baghdasaryan
  2022-05-17 12:46       ` Chen Wandun
  2022-05-17  9:38     ` Chen Wandun
  1 sibling, 1 reply; 17+ messages in thread
From: Suren Baghdasaryan @ 2022-05-16  8:43 UTC (permalink / raw)
  To: Alex Shi
  Cc: Chen Wandun, LKML, Johannes Weiner, Alex Shi, Jonathan Corbet,
	open list:DOCUMENTATION

On Mon, May 16, 2022 at 1:21 AM Suren Baghdasaryan <surenb@google.com> wrote:
>
> On Sun, May 15, 2022 at 11:20 PM Alex Shi <seakeel@gmail.com> wrote:
> >
> >
> >
> > On 5/16/22 11:35, Chen Wandun wrote:
> > > Nowadays, psi events are triggered when stall time exceed
> > > stall threshold, but no any different between these events.
> > >
> > > Actually, events can be divide into multi level, each level
> > > represent a different stall pressure, that is help to identify
> > > pressure information more accurately.
>
> IIUC by defining min and max, you want the trigger to activate when
> the stall is between min and max thresholds. But I don't see why you
> would need that. If you want to have several levels, you can create
> multiple triggers and monitor them separately. For your example, that
> would be:
>
> echo "some 150000 1000000" > /proc/pressure/memory
> echo "some 350000 1000000" > /proc/pressure/memory
>
> Your first trigger will fire whenever the stall exceeds 150ms within
> each 1sec and the second one will trigger when it exceeds 350ms. It is
> true that if the stall jumps sharply above 350ms, you would get both
> triggers firing. I'm guessing that's why you want this functionality
> so that 150ms trigger does not fire when 350ms one is firing but why
> is that a problem? Can't userspace pick the highest level one and
> ignore all the lower ones when this happens? Or are you addressing
> some other requirement?
>
> > >
> > > echo "some 150000 350000 1000000" > /proc/pressure/memory would
> >
> > This breaks the old ABI. And why you need this new function?
>
> Both great points.

BTW, I think the additional max_threshold parameter could be
implemented in a backward compatible way so that the old API is not
broken:

arg_count = sscanf(buf, "some %u %u %u", &min_threshold_us,  &arg2, &arg3);
if (arg_count < 2) return ERR_PTR(-EINVAL);
if (arg_count < 3) {
    max_threshold_us = INT_MAX;
    window_us = arg2;
} else {
    max_threshold_us = arg2;
    window_us = arg3;
}

But again, the motivation still needs to be explained.

>
> >
> > Thanks
> >
> > > add [150ms, 350ms) threshold for partial memory stall measured
> > > within 1sec time window.
> > >
> > > Signed-off-by: Chen Wandun <chenwandun@huawei.com>
> > > ---
> > >  include/linux/psi_types.h |  3 ++-
> > >  kernel/sched/psi.c        | 19 +++++++++++++------
> > >  2 files changed, 15 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/include/linux/psi_types.h b/include/linux/psi_types.h
> > > index c7fe7c089718..2b1393c8bf90 100644
> > > --- a/include/linux/psi_types.h
> > > +++ b/include/linux/psi_types.h
> > > @@ -119,7 +119,8 @@ struct psi_trigger {
> > >       enum psi_states state;
> > >
> > >       /* User-spacified threshold in ns */
> > > -     u64 threshold;
> > > +     u64 min_threshold;
> > > +     u64 max_threshold;
> > >
> > >       /* List node inside triggers list */
> > >       struct list_head node;
> > > diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
> > > index 6f9533c95b0a..17dd233b533a 100644
> > > --- a/kernel/sched/psi.c
> > > +++ b/kernel/sched/psi.c
> > > @@ -541,7 +541,7 @@ static u64 update_triggers(struct psi_group *group, u64 now)
> > >
> > >                       /* Calculate growth since last update */
> > >                       growth = window_update(&t->win, now, total[t->state]);
> > > -                     if (growth < t->threshold)
> > > +                     if (growth < t->min_threshold || growth >= t->max_threshold)
> > >                               continue;
> > >
> > >                       t->pending_event = true;
> > > @@ -1087,15 +1087,18 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
> > >  {
> > >       struct psi_trigger *t;
> > >       enum psi_states state;
> > > -     u32 threshold_us;
> > > +     u32 min_threshold_us;
> > > +     u32 max_threshold_us;
> > >       u32 window_us;
> > >
> > >       if (static_branch_likely(&psi_disabled))
> > >               return ERR_PTR(-EOPNOTSUPP);
> > >
> > > -     if (sscanf(buf, "some %u %u", &threshold_us, &window_us) == 2)
> > > +     if (sscanf(buf, "some %u %u %u", &min_threshold_us,
> > > +                             &max_threshold_us, &window_us) == 3)
> > >               state = PSI_IO_SOME + res * 2;
> > > -     else if (sscanf(buf, "full %u %u", &threshold_us, &window_us) == 2)
> > > +     else if (sscanf(buf, "full %u %u %u", &min_threshold_us,
> > > +                             &max_threshold_us, &window_us) == 3)
> > >               state = PSI_IO_FULL + res * 2;
> > >       else
> > >               return ERR_PTR(-EINVAL);
> > > @@ -1107,8 +1110,11 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
> > >               window_us > WINDOW_MAX_US)
> > >               return ERR_PTR(-EINVAL);
> > >
> > > +     if (min_threshold_us >= max_threshold_us)
> > > +             return ERR_PTR(-EINVAL);
> > > +
> > >       /* Check threshold */
> > > -     if (threshold_us == 0 || threshold_us > window_us)
> > > +     if (max_threshold_us > window_us)
> > >               return ERR_PTR(-EINVAL);
> > >
> > >       t = kmalloc(sizeof(*t), GFP_KERNEL);
> > > @@ -1117,7 +1123,8 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
> > >
> > >       t->group = group;
> > >       t->state = state;
> > > -     t->threshold = threshold_us * NSEC_PER_USEC;
> > > +     t->min_threshold = min_threshold_us * NSEC_PER_USEC;
> > > +     t->max_threshold = max_threshold_us * NSEC_PER_USEC;
> > >       t->win.size = window_us * NSEC_PER_USEC;
> > >       window_reset(&t->win, 0, 0, 0);
> > >

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] psi: add support for multi level pressure stall trigger
  2022-05-16  6:20 ` [PATCH 1/2] psi: add support for multi level pressure stall trigger Alex Shi
  2022-05-16  8:21   ` Suren Baghdasaryan
@ 2022-05-17  9:08   ` Chen Wandun
  1 sibling, 0 replies; 17+ messages in thread
From: Chen Wandun @ 2022-05-17  9:08 UTC (permalink / raw)
  To: Alex Shi, linux-kernel, hannes, surenb, alexs, corbet, linux-doc



在 2022/5/16 14:20, Alex Shi 写道:
>
> On 5/16/22 11:35, Chen Wandun wrote:
>> Nowadays, psi events are triggered when stall time exceed
>> stall threshold, but no any different between these events.
>>
>> Actually, events can be divide into multi level, each level
>> represent a different stall pressure, that is help to identify
>> pressure information more accurately.
>>
>> echo "some 150000 350000 1000000" > /proc/pressure/memory would
> This breaks the old ABI. And why you need this new function?
We want to do different measures according to different stall levels,

In small stall case, maybe only low level warning is needed,  In big

stall case, maybe aggressive memory reclaim is needed.

so it is necessary to distinguish different levels.
>
> Thanks
>
>> add [150ms, 350ms) threshold for partial memory stall measured
>> within 1sec time window.
>>
>> Signed-off-by: Chen Wandun <chenwandun@huawei.com>
>> ---
>>   include/linux/psi_types.h |  3 ++-
>>   kernel/sched/psi.c        | 19 +++++++++++++------
>>   2 files changed, 15 insertions(+), 7 deletions(-)
>>
>> diff --git a/include/linux/psi_types.h b/include/linux/psi_types.h
>> index c7fe7c089718..2b1393c8bf90 100644
>> --- a/include/linux/psi_types.h
>> +++ b/include/linux/psi_types.h
>> @@ -119,7 +119,8 @@ struct psi_trigger {
>>   	enum psi_states state;
>>   
>>   	/* User-spacified threshold in ns */
>> -	u64 threshold;
>> +	u64 min_threshold;
>> +	u64 max_threshold;
>>   
>>   	/* List node inside triggers list */
>>   	struct list_head node;
>> diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
>> index 6f9533c95b0a..17dd233b533a 100644
>> --- a/kernel/sched/psi.c
>> +++ b/kernel/sched/psi.c
>> @@ -541,7 +541,7 @@ static u64 update_triggers(struct psi_group *group, u64 now)
>>   
>>   			/* Calculate growth since last update */
>>   			growth = window_update(&t->win, now, total[t->state]);
>> -			if (growth < t->threshold)
>> +			if (growth < t->min_threshold || growth >= t->max_threshold)
>>   				continue;
>>   
>>   			t->pending_event = true;
>> @@ -1087,15 +1087,18 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
>>   {
>>   	struct psi_trigger *t;
>>   	enum psi_states state;
>> -	u32 threshold_us;
>> +	u32 min_threshold_us;
>> +	u32 max_threshold_us;
>>   	u32 window_us;
>>   
>>   	if (static_branch_likely(&psi_disabled))
>>   		return ERR_PTR(-EOPNOTSUPP);
>>   
>> -	if (sscanf(buf, "some %u %u", &threshold_us, &window_us) == 2)
>> +	if (sscanf(buf, "some %u %u %u", &min_threshold_us,
>> +				&max_threshold_us, &window_us) == 3)
>>   		state = PSI_IO_SOME + res * 2;
>> -	else if (sscanf(buf, "full %u %u", &threshold_us, &window_us) == 2)
>> +	else if (sscanf(buf, "full %u %u %u", &min_threshold_us,
>> +				&max_threshold_us, &window_us) == 3)
>>   		state = PSI_IO_FULL + res * 2;
>>   	else
>>   		return ERR_PTR(-EINVAL);
>> @@ -1107,8 +1110,11 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
>>   		window_us > WINDOW_MAX_US)
>>   		return ERR_PTR(-EINVAL);
>>   
>> +	if (min_threshold_us >= max_threshold_us)
>> +		return ERR_PTR(-EINVAL);
>> +
>>   	/* Check threshold */
>> -	if (threshold_us == 0 || threshold_us > window_us)
>> +	if (max_threshold_us > window_us)
>>   		return ERR_PTR(-EINVAL);
>>   
>>   	t = kmalloc(sizeof(*t), GFP_KERNEL);
>> @@ -1117,7 +1123,8 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
>>   
>>   	t->group = group;
>>   	t->state = state;
>> -	t->threshold = threshold_us * NSEC_PER_USEC;
>> +	t->min_threshold = min_threshold_us * NSEC_PER_USEC;
>> +	t->max_threshold = max_threshold_us * NSEC_PER_USEC;
>>   	t->win.size = window_us * NSEC_PER_USEC;
>>   	window_reset(&t->win, 0, 0, 0);
>>   
> .


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] psi: add support for multi level pressure stall trigger
  2022-05-16  8:21   ` Suren Baghdasaryan
  2022-05-16  8:43     ` Suren Baghdasaryan
@ 2022-05-17  9:38     ` Chen Wandun
  1 sibling, 0 replies; 17+ messages in thread
From: Chen Wandun @ 2022-05-17  9:38 UTC (permalink / raw)
  To: Suren Baghdasaryan, Alex Shi
  Cc: LKML, Johannes Weiner, Alex Shi, Jonathan Corbet,
	open list:DOCUMENTATION



在 2022/5/16 16:21, Suren Baghdasaryan 写道:
> On Sun, May 15, 2022 at 11:20 PM Alex Shi <seakeel@gmail.com> wrote:
>>
>>
>> On 5/16/22 11:35, Chen Wandun wrote:
>>> Nowadays, psi events are triggered when stall time exceed
>>> stall threshold, but no any different between these events.
>>>
>>> Actually, events can be divide into multi level, each level
>>> represent a different stall pressure, that is help to identify
>>> pressure information more accurately.
> IIUC by defining min and max, you want the trigger to activate when
> the stall is between min and max thresholds. But I don't see why you
> would need that. If you want to have several levels, you can create
> multiple triggers and monitor them separately. For your example, that
> would be:
>
> echo "some 150000 1000000" > /proc/pressure/memory
> echo "some 350000 1000000" > /proc/pressure/memory
>
> Your first trigger will fire whenever the stall exceeds 150ms within
> each 1sec and the second one will trigger when it exceeds 350ms. It is
> true that if the stall jumps sharply above 350ms, you would get both
> triggers firing. I'm guessing that's why you want this functionality
> so that 150ms trigger does not fire when 350ms one is firing but why
Yes, if stall time above 350ms, I hope only one trigger fire.
> is that a problem? Can't userspace pick the highest level one and
> ignore all the lower ones when this happens? Or are you addressing
> some other requirement?
Userspace can pick the higest level, but more triggers fire, actually one
trigger fire is enough in this case, and userspace  become more complex.

new trigger is compatible with old.

echo "some 150000 1000000 1000000 " > /proc/pressure/memory

can achieve the same goal with

echo "some 150000 1000000 " > /proc/pressure/memory
>
>>> echo "some 150000 350000 1000000" > /proc/pressure/memory would
>> This breaks the old ABI. And why you need this new function?
> Both great points.
>
>> Thanks
>>
>>> add [150ms, 350ms) threshold for partial memory stall measured
>>> within 1sec time window.
>>>
>>> Signed-off-by: Chen Wandun <chenwandun@huawei.com>
>>> ---
>>>   include/linux/psi_types.h |  3 ++-
>>>   kernel/sched/psi.c        | 19 +++++++++++++------
>>>   2 files changed, 15 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/include/linux/psi_types.h b/include/linux/psi_types.h
>>> index c7fe7c089718..2b1393c8bf90 100644
>>> --- a/include/linux/psi_types.h
>>> +++ b/include/linux/psi_types.h
>>> @@ -119,7 +119,8 @@ struct psi_trigger {
>>>        enum psi_states state;
>>>
>>>        /* User-spacified threshold in ns */
>>> -     u64 threshold;
>>> +     u64 min_threshold;
>>> +     u64 max_threshold;
>>>
>>>        /* List node inside triggers list */
>>>        struct list_head node;
>>> diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
>>> index 6f9533c95b0a..17dd233b533a 100644
>>> --- a/kernel/sched/psi.c
>>> +++ b/kernel/sched/psi.c
>>> @@ -541,7 +541,7 @@ static u64 update_triggers(struct psi_group *group, u64 now)
>>>
>>>                        /* Calculate growth since last update */
>>>                        growth = window_update(&t->win, now, total[t->state]);
>>> -                     if (growth < t->threshold)
>>> +                     if (growth < t->min_threshold || growth >= t->max_threshold)
>>>                                continue;
>>>
>>>                        t->pending_event = true;
>>> @@ -1087,15 +1087,18 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
>>>   {
>>>        struct psi_trigger *t;
>>>        enum psi_states state;
>>> -     u32 threshold_us;
>>> +     u32 min_threshold_us;
>>> +     u32 max_threshold_us;
>>>        u32 window_us;
>>>
>>>        if (static_branch_likely(&psi_disabled))
>>>                return ERR_PTR(-EOPNOTSUPP);
>>>
>>> -     if (sscanf(buf, "some %u %u", &threshold_us, &window_us) == 2)
>>> +     if (sscanf(buf, "some %u %u %u", &min_threshold_us,
>>> +                             &max_threshold_us, &window_us) == 3)
>>>                state = PSI_IO_SOME + res * 2;
>>> -     else if (sscanf(buf, "full %u %u", &threshold_us, &window_us) == 2)
>>> +     else if (sscanf(buf, "full %u %u %u", &min_threshold_us,
>>> +                             &max_threshold_us, &window_us) == 3)
>>>                state = PSI_IO_FULL + res * 2;
>>>        else
>>>                return ERR_PTR(-EINVAL);
>>> @@ -1107,8 +1110,11 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
>>>                window_us > WINDOW_MAX_US)
>>>                return ERR_PTR(-EINVAL);
>>>
>>> +     if (min_threshold_us >= max_threshold_us)
>>> +             return ERR_PTR(-EINVAL);
>>> +
>>>        /* Check threshold */
>>> -     if (threshold_us == 0 || threshold_us > window_us)
>>> +     if (max_threshold_us > window_us)
>>>                return ERR_PTR(-EINVAL);
>>>
>>>        t = kmalloc(sizeof(*t), GFP_KERNEL);
>>> @@ -1117,7 +1123,8 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
>>>
>>>        t->group = group;
>>>        t->state = state;
>>> -     t->threshold = threshold_us * NSEC_PER_USEC;
>>> +     t->min_threshold = min_threshold_us * NSEC_PER_USEC;
>>> +     t->max_threshold = max_threshold_us * NSEC_PER_USEC;
>>>        t->win.size = window_us * NSEC_PER_USEC;
>>>        window_reset(&t->win, 0, 0, 0);
>>>
> .


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] psi: add support for multi level pressure stall trigger
  2022-05-16  8:43     ` Suren Baghdasaryan
@ 2022-05-17 12:46       ` Chen Wandun
  2022-05-17 17:35         ` Suren Baghdasaryan
  2022-05-18 10:29         ` Alex Shi
  0 siblings, 2 replies; 17+ messages in thread
From: Chen Wandun @ 2022-05-17 12:46 UTC (permalink / raw)
  To: Suren Baghdasaryan, Alex Shi
  Cc: LKML, Johannes Weiner, Alex Shi, Jonathan Corbet,
	open list:DOCUMENTATION



在 2022/5/16 16:43, Suren Baghdasaryan 写道:
> On Mon, May 16, 2022 at 1:21 AM Suren Baghdasaryan <surenb@google.com> wrote:
>> On Sun, May 15, 2022 at 11:20 PM Alex Shi <seakeel@gmail.com> wrote:
>>>
>>>
>>> On 5/16/22 11:35, Chen Wandun wrote:
>>>> Nowadays, psi events are triggered when stall time exceed
>>>> stall threshold, but no any different between these events.
>>>>
>>>> Actually, events can be divide into multi level, each level
>>>> represent a different stall pressure, that is help to identify
>>>> pressure information more accurately.
>> IIUC by defining min and max, you want the trigger to activate when
>> the stall is between min and max thresholds. But I don't see why you
>> would need that. If you want to have several levels, you can create
>> multiple triggers and monitor them separately. For your example, that
>> would be:
>>
>> echo "some 150000 1000000" > /proc/pressure/memory
>> echo "some 350000 1000000" > /proc/pressure/memory
>>
>> Your first trigger will fire whenever the stall exceeds 150ms within
>> each 1sec and the second one will trigger when it exceeds 350ms. It is
>> true that if the stall jumps sharply above 350ms, you would get both
>> triggers firing. I'm guessing that's why you want this functionality
>> so that 150ms trigger does not fire when 350ms one is firing but why
>> is that a problem? Can't userspace pick the highest level one and
>> ignore all the lower ones when this happens? Or are you addressing
>> some other requirement?
>>
>>>> echo "some 150000 350000 1000000" > /proc/pressure/memory would
>>> This breaks the old ABI. And why you need this new function?
>> Both great points.
> BTW, I think the additional max_threshold parameter could be
> implemented in a backward compatible way so that the old API is not
> broken:
>
> arg_count = sscanf(buf, "some %u %u %u", &min_threshold_us,  &arg2, &arg3);
> if (arg_count < 2) return ERR_PTR(-EINVAL);
> if (arg_count < 3) {
>      max_threshold_us = INT_MAX;
>      window_us = arg2;
> } else {
>      max_threshold_us = arg2;
>      window_us = arg3;
> }
OK

Thanks.
> But again, the motivation still needs to be explained.
we want do different operation for different stall level,
just as prev email explain, multi trigger is also OK in old
ways, but it is a litter complex.

>
>>> Thanks
>>>
>>>> add [150ms, 350ms) threshold for partial memory stall measured
>>>> within 1sec time window.
>>>>
>>>> Signed-off-by: Chen Wandun <chenwandun@huawei.com>
>>>> ---
>>>>   include/linux/psi_types.h |  3 ++-
>>>>   kernel/sched/psi.c        | 19 +++++++++++++------
>>>>   2 files changed, 15 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/include/linux/psi_types.h b/include/linux/psi_types.h
>>>> index c7fe7c089718..2b1393c8bf90 100644
>>>> --- a/include/linux/psi_types.h
>>>> +++ b/include/linux/psi_types.h
>>>> @@ -119,7 +119,8 @@ struct psi_trigger {
>>>>        enum psi_states state;
>>>>
>>>>        /* User-spacified threshold in ns */
>>>> -     u64 threshold;
>>>> +     u64 min_threshold;
>>>> +     u64 max_threshold;
>>>>
>>>>        /* List node inside triggers list */
>>>>        struct list_head node;
>>>> diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
>>>> index 6f9533c95b0a..17dd233b533a 100644
>>>> --- a/kernel/sched/psi.c
>>>> +++ b/kernel/sched/psi.c
>>>> @@ -541,7 +541,7 @@ static u64 update_triggers(struct psi_group *group, u64 now)
>>>>
>>>>                        /* Calculate growth since last update */
>>>>                        growth = window_update(&t->win, now, total[t->state]);
>>>> -                     if (growth < t->threshold)
>>>> +                     if (growth < t->min_threshold || growth >= t->max_threshold)
>>>>                                continue;
>>>>
>>>>                        t->pending_event = true;
>>>> @@ -1087,15 +1087,18 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
>>>>   {
>>>>        struct psi_trigger *t;
>>>>        enum psi_states state;
>>>> -     u32 threshold_us;
>>>> +     u32 min_threshold_us;
>>>> +     u32 max_threshold_us;
>>>>        u32 window_us;
>>>>
>>>>        if (static_branch_likely(&psi_disabled))
>>>>                return ERR_PTR(-EOPNOTSUPP);
>>>>
>>>> -     if (sscanf(buf, "some %u %u", &threshold_us, &window_us) == 2)
>>>> +     if (sscanf(buf, "some %u %u %u", &min_threshold_us,
>>>> +                             &max_threshold_us, &window_us) == 3)
>>>>                state = PSI_IO_SOME + res * 2;
>>>> -     else if (sscanf(buf, "full %u %u", &threshold_us, &window_us) == 2)
>>>> +     else if (sscanf(buf, "full %u %u %u", &min_threshold_us,
>>>> +                             &max_threshold_us, &window_us) == 3)
>>>>                state = PSI_IO_FULL + res * 2;
>>>>        else
>>>>                return ERR_PTR(-EINVAL);
>>>> @@ -1107,8 +1110,11 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
>>>>                window_us > WINDOW_MAX_US)
>>>>                return ERR_PTR(-EINVAL);
>>>>
>>>> +     if (min_threshold_us >= max_threshold_us)
>>>> +             return ERR_PTR(-EINVAL);
>>>> +
>>>>        /* Check threshold */
>>>> -     if (threshold_us == 0 || threshold_us > window_us)
>>>> +     if (max_threshold_us > window_us)
>>>>                return ERR_PTR(-EINVAL);
>>>>
>>>>        t = kmalloc(sizeof(*t), GFP_KERNEL);
>>>> @@ -1117,7 +1123,8 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
>>>>
>>>>        t->group = group;
>>>>        t->state = state;
>>>> -     t->threshold = threshold_us * NSEC_PER_USEC;
>>>> +     t->min_threshold = min_threshold_us * NSEC_PER_USEC;
>>>> +     t->max_threshold = max_threshold_us * NSEC_PER_USEC;
>>>>        t->win.size = window_us * NSEC_PER_USEC;
>>>>        window_reset(&t->win, 0, 0, 0);
>>>>
> .


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] psi: add support for multi level pressure stall trigger
  2022-05-17 12:46       ` Chen Wandun
@ 2022-05-17 17:35         ` Suren Baghdasaryan
  2022-05-18  9:55           ` Chen Wandun
  2022-05-18 10:29         ` Alex Shi
  1 sibling, 1 reply; 17+ messages in thread
From: Suren Baghdasaryan @ 2022-05-17 17:35 UTC (permalink / raw)
  To: Chen Wandun
  Cc: Alex Shi, LKML, Johannes Weiner, Alex Shi, Jonathan Corbet,
	open list:DOCUMENTATION

On Tue, May 17, 2022 at 5:46 AM Chen Wandun <chenwandun@huawei.com> wrote:
>
>
>
> 在 2022/5/16 16:43, Suren Baghdasaryan 写道:
> > On Mon, May 16, 2022 at 1:21 AM Suren Baghdasaryan <surenb@google.com> wrote:
> >> On Sun, May 15, 2022 at 11:20 PM Alex Shi <seakeel@gmail.com> wrote:
> >>>
> >>>
> >>> On 5/16/22 11:35, Chen Wandun wrote:
> >>>> Nowadays, psi events are triggered when stall time exceed
> >>>> stall threshold, but no any different between these events.
> >>>>
> >>>> Actually, events can be divide into multi level, each level
> >>>> represent a different stall pressure, that is help to identify
> >>>> pressure information more accurately.
> >> IIUC by defining min and max, you want the trigger to activate when
> >> the stall is between min and max thresholds. But I don't see why you
> >> would need that. If you want to have several levels, you can create
> >> multiple triggers and monitor them separately. For your example, that
> >> would be:
> >>
> >> echo "some 150000 1000000" > /proc/pressure/memory
> >> echo "some 350000 1000000" > /proc/pressure/memory
> >>
> >> Your first trigger will fire whenever the stall exceeds 150ms within
> >> each 1sec and the second one will trigger when it exceeds 350ms. It is
> >> true that if the stall jumps sharply above 350ms, you would get both
> >> triggers firing. I'm guessing that's why you want this functionality
> >> so that 150ms trigger does not fire when 350ms one is firing but why
> >> is that a problem? Can't userspace pick the highest level one and
> >> ignore all the lower ones when this happens? Or are you addressing
> >> some other requirement?
> >>
> >>>> echo "some 150000 350000 1000000" > /proc/pressure/memory would
> >>> This breaks the old ABI. And why you need this new function?
> >> Both great points.
> > BTW, I think the additional max_threshold parameter could be
> > implemented in a backward compatible way so that the old API is not
> > broken:
> >
> > arg_count = sscanf(buf, "some %u %u %u", &min_threshold_us,  &arg2, &arg3);
> > if (arg_count < 2) return ERR_PTR(-EINVAL);
> > if (arg_count < 3) {
> >      max_threshold_us = INT_MAX;
> >      window_us = arg2;
> > } else {
> >      max_threshold_us = arg2;
> >      window_us = arg3;
> > }
> OK
>
> Thanks.
> > But again, the motivation still needs to be explained.
> we want do different operation for different stall level,
> just as prev email explain, multi trigger is also OK in old
> ways, but it is a litter complex.

Ok, so the issue can be dealt with in the userspace but would make it
simpler if max_threashold is supported by the kernel. I can buy this
argument if the kernel implementation is not complex and max_threshold
is added in a way that does not break current users. I believe both
conditions can be met.

>
> >
> >>> Thanks
> >>>
> >>>> add [150ms, 350ms) threshold for partial memory stall measured
> >>>> within 1sec time window.
> >>>>
> >>>> Signed-off-by: Chen Wandun <chenwandun@huawei.com>
> >>>> ---
> >>>>   include/linux/psi_types.h |  3 ++-
> >>>>   kernel/sched/psi.c        | 19 +++++++++++++------
> >>>>   2 files changed, 15 insertions(+), 7 deletions(-)
> >>>>
> >>>> diff --git a/include/linux/psi_types.h b/include/linux/psi_types.h
> >>>> index c7fe7c089718..2b1393c8bf90 100644
> >>>> --- a/include/linux/psi_types.h
> >>>> +++ b/include/linux/psi_types.h
> >>>> @@ -119,7 +119,8 @@ struct psi_trigger {
> >>>>        enum psi_states state;
> >>>>
> >>>>        /* User-spacified threshold in ns */
> >>>> -     u64 threshold;
> >>>> +     u64 min_threshold;
> >>>> +     u64 max_threshold;
> >>>>
> >>>>        /* List node inside triggers list */
> >>>>        struct list_head node;
> >>>> diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
> >>>> index 6f9533c95b0a..17dd233b533a 100644
> >>>> --- a/kernel/sched/psi.c
> >>>> +++ b/kernel/sched/psi.c
> >>>> @@ -541,7 +541,7 @@ static u64 update_triggers(struct psi_group *group, u64 now)
> >>>>
> >>>>                        /* Calculate growth since last update */
> >>>>                        growth = window_update(&t->win, now, total[t->state]);
> >>>> -                     if (growth < t->threshold)
> >>>> +                     if (growth < t->min_threshold || growth >= t->max_threshold)
> >>>>                                continue;
> >>>>
> >>>>                        t->pending_event = true;
> >>>> @@ -1087,15 +1087,18 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
> >>>>   {
> >>>>        struct psi_trigger *t;
> >>>>        enum psi_states state;
> >>>> -     u32 threshold_us;
> >>>> +     u32 min_threshold_us;
> >>>> +     u32 max_threshold_us;
> >>>>        u32 window_us;
> >>>>
> >>>>        if (static_branch_likely(&psi_disabled))
> >>>>                return ERR_PTR(-EOPNOTSUPP);
> >>>>
> >>>> -     if (sscanf(buf, "some %u %u", &threshold_us, &window_us) == 2)
> >>>> +     if (sscanf(buf, "some %u %u %u", &min_threshold_us,
> >>>> +                             &max_threshold_us, &window_us) == 3)
> >>>>                state = PSI_IO_SOME + res * 2;
> >>>> -     else if (sscanf(buf, "full %u %u", &threshold_us, &window_us) == 2)
> >>>> +     else if (sscanf(buf, "full %u %u %u", &min_threshold_us,
> >>>> +                             &max_threshold_us, &window_us) == 3)
> >>>>                state = PSI_IO_FULL + res * 2;
> >>>>        else
> >>>>                return ERR_PTR(-EINVAL);
> >>>> @@ -1107,8 +1110,11 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
> >>>>                window_us > WINDOW_MAX_US)
> >>>>                return ERR_PTR(-EINVAL);
> >>>>
> >>>> +     if (min_threshold_us >= max_threshold_us)
> >>>> +             return ERR_PTR(-EINVAL);
> >>>> +
> >>>>        /* Check threshold */
> >>>> -     if (threshold_us == 0 || threshold_us > window_us)
> >>>> +     if (max_threshold_us > window_us)
> >>>>                return ERR_PTR(-EINVAL);
> >>>>
> >>>>        t = kmalloc(sizeof(*t), GFP_KERNEL);
> >>>> @@ -1117,7 +1123,8 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
> >>>>
> >>>>        t->group = group;
> >>>>        t->state = state;
> >>>> -     t->threshold = threshold_us * NSEC_PER_USEC;
> >>>> +     t->min_threshold = min_threshold_us * NSEC_PER_USEC;
> >>>> +     t->max_threshold = max_threshold_us * NSEC_PER_USEC;
> >>>>        t->win.size = window_us * NSEC_PER_USEC;
> >>>>        window_reset(&t->win, 0, 0, 0);
> >>>>
> > .
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] psi: add support for multi level pressure stall trigger
  2022-05-17 17:35         ` Suren Baghdasaryan
@ 2022-05-18  9:55           ` Chen Wandun
  0 siblings, 0 replies; 17+ messages in thread
From: Chen Wandun @ 2022-05-18  9:55 UTC (permalink / raw)
  To: Suren Baghdasaryan
  Cc: Alex Shi, LKML, Johannes Weiner, Alex Shi, Jonathan Corbet,
	open list:DOCUMENTATION



在 2022/5/18 1:35, Suren Baghdasaryan 写道:
> On Tue, May 17, 2022 at 5:46 AM Chen Wandun <chenwandun@huawei.com> wrote:
>>
>>
>> 在 2022/5/16 16:43, Suren Baghdasaryan 写道:
>>> On Mon, May 16, 2022 at 1:21 AM Suren Baghdasaryan <surenb@google.com> wrote:
>>>> On Sun, May 15, 2022 at 11:20 PM Alex Shi <seakeel@gmail.com> wrote:
>>>>>
>>>>> On 5/16/22 11:35, Chen Wandun wrote:
>>>>>> Nowadays, psi events are triggered when stall time exceed
>>>>>> stall threshold, but no any different between these events.
>>>>>>
>>>>>> Actually, events can be divide into multi level, each level
>>>>>> represent a different stall pressure, that is help to identify
>>>>>> pressure information more accurately.
>>>> IIUC by defining min and max, you want the trigger to activate when
>>>> the stall is between min and max thresholds. But I don't see why you
>>>> would need that. If you want to have several levels, you can create
>>>> multiple triggers and monitor them separately. For your example, that
>>>> would be:
>>>>
>>>> echo "some 150000 1000000" > /proc/pressure/memory
>>>> echo "some 350000 1000000" > /proc/pressure/memory
>>>>
>>>> Your first trigger will fire whenever the stall exceeds 150ms within
>>>> each 1sec and the second one will trigger when it exceeds 350ms. It is
>>>> true that if the stall jumps sharply above 350ms, you would get both
>>>> triggers firing. I'm guessing that's why you want this functionality
>>>> so that 150ms trigger does not fire when 350ms one is firing but why
>>>> is that a problem? Can't userspace pick the highest level one and
>>>> ignore all the lower ones when this happens? Or are you addressing
>>>> some other requirement?
>>>>
>>>>>> echo "some 150000 350000 1000000" > /proc/pressure/memory would
>>>>> This breaks the old ABI. And why you need this new function?
>>>> Both great points.
>>> BTW, I think the additional max_threshold parameter could be
>>> implemented in a backward compatible way so that the old API is not
>>> broken:
>>>
>>> arg_count = sscanf(buf, "some %u %u %u", &min_threshold_us,  &arg2, &arg3);
>>> if (arg_count < 2) return ERR_PTR(-EINVAL);
>>> if (arg_count < 3) {
>>>       max_threshold_us = INT_MAX;
>>>       window_us = arg2;
>>> } else {
>>>       max_threshold_us = arg2;
>>>       window_us = arg3;
>>> }
>> OK
>>
>> Thanks.
>>> But again, the motivation still needs to be explained.
>> we want do different operation for different stall level,
>> just as prev email explain, multi trigger is also OK in old
>> ways, but it is a litter complex.
> Ok, so the issue can be dealt with in the userspace but would make it
> simpler if max_threashold is supported by the kernel. I can buy this
> argument if the kernel implementation is not complex and max_threshold
> is added in a way that does not break current users. I believe both
> conditions can be met.
OK, I will send v2

Thanks
>
>>>>> Thanks
>>>>>
>>>>>> add [150ms, 350ms) threshold for partial memory stall measured
>>>>>> within 1sec time window.
>>>>>>
>>>>>> Signed-off-by: Chen Wandun <chenwandun@huawei.com>
>>>>>> ---
>>>>>>    include/linux/psi_types.h |  3 ++-
>>>>>>    kernel/sched/psi.c        | 19 +++++++++++++------
>>>>>>    2 files changed, 15 insertions(+), 7 deletions(-)
>>>>>>
>>>>>> diff --git a/include/linux/psi_types.h b/include/linux/psi_types.h
>>>>>> index c7fe7c089718..2b1393c8bf90 100644
>>>>>> --- a/include/linux/psi_types.h
>>>>>> +++ b/include/linux/psi_types.h
>>>>>> @@ -119,7 +119,8 @@ struct psi_trigger {
>>>>>>         enum psi_states state;
>>>>>>
>>>>>>         /* User-spacified threshold in ns */
>>>>>> -     u64 threshold;
>>>>>> +     u64 min_threshold;
>>>>>> +     u64 max_threshold;
>>>>>>
>>>>>>         /* List node inside triggers list */
>>>>>>         struct list_head node;
>>>>>> diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
>>>>>> index 6f9533c95b0a..17dd233b533a 100644
>>>>>> --- a/kernel/sched/psi.c
>>>>>> +++ b/kernel/sched/psi.c
>>>>>> @@ -541,7 +541,7 @@ static u64 update_triggers(struct psi_group *group, u64 now)
>>>>>>
>>>>>>                         /* Calculate growth since last update */
>>>>>>                         growth = window_update(&t->win, now, total[t->state]);
>>>>>> -                     if (growth < t->threshold)
>>>>>> +                     if (growth < t->min_threshold || growth >= t->max_threshold)
>>>>>>                                 continue;
>>>>>>
>>>>>>                         t->pending_event = true;
>>>>>> @@ -1087,15 +1087,18 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
>>>>>>    {
>>>>>>         struct psi_trigger *t;
>>>>>>         enum psi_states state;
>>>>>> -     u32 threshold_us;
>>>>>> +     u32 min_threshold_us;
>>>>>> +     u32 max_threshold_us;
>>>>>>         u32 window_us;
>>>>>>
>>>>>>         if (static_branch_likely(&psi_disabled))
>>>>>>                 return ERR_PTR(-EOPNOTSUPP);
>>>>>>
>>>>>> -     if (sscanf(buf, "some %u %u", &threshold_us, &window_us) == 2)
>>>>>> +     if (sscanf(buf, "some %u %u %u", &min_threshold_us,
>>>>>> +                             &max_threshold_us, &window_us) == 3)
>>>>>>                 state = PSI_IO_SOME + res * 2;
>>>>>> -     else if (sscanf(buf, "full %u %u", &threshold_us, &window_us) == 2)
>>>>>> +     else if (sscanf(buf, "full %u %u %u", &min_threshold_us,
>>>>>> +                             &max_threshold_us, &window_us) == 3)
>>>>>>                 state = PSI_IO_FULL + res * 2;
>>>>>>         else
>>>>>>                 return ERR_PTR(-EINVAL);
>>>>>> @@ -1107,8 +1110,11 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
>>>>>>                 window_us > WINDOW_MAX_US)
>>>>>>                 return ERR_PTR(-EINVAL);
>>>>>>
>>>>>> +     if (min_threshold_us >= max_threshold_us)
>>>>>> +             return ERR_PTR(-EINVAL);
>>>>>> +
>>>>>>         /* Check threshold */
>>>>>> -     if (threshold_us == 0 || threshold_us > window_us)
>>>>>> +     if (max_threshold_us > window_us)
>>>>>>                 return ERR_PTR(-EINVAL);
>>>>>>
>>>>>>         t = kmalloc(sizeof(*t), GFP_KERNEL);
>>>>>> @@ -1117,7 +1123,8 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
>>>>>>
>>>>>>         t->group = group;
>>>>>>         t->state = state;
>>>>>> -     t->threshold = threshold_us * NSEC_PER_USEC;
>>>>>> +     t->min_threshold = min_threshold_us * NSEC_PER_USEC;
>>>>>> +     t->max_threshold = max_threshold_us * NSEC_PER_USEC;
>>>>>>         t->win.size = window_us * NSEC_PER_USEC;
>>>>>>         window_reset(&t->win, 0, 0, 0);
>>>>>>
>>> .
> .


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] psi: add support for multi level pressure stall trigger
  2022-05-17 12:46       ` Chen Wandun
  2022-05-17 17:35         ` Suren Baghdasaryan
@ 2022-05-18 10:29         ` Alex Shi
  2022-05-18 21:38           ` Suren Baghdasaryan
  1 sibling, 1 reply; 17+ messages in thread
From: Alex Shi @ 2022-05-18 10:29 UTC (permalink / raw)
  To: Chen Wandun, Suren Baghdasaryan
  Cc: LKML, Johannes Weiner, Alex Shi, Jonathan Corbet,
	open list:DOCUMENTATION



On 5/17/22 20:46, Chen Wandun wrote:
>>>> This breaks the old ABI. And why you need this new function?
>>> Both great points.
>> BTW, I think the additional max_threshold parameter could be
>> implemented in a backward compatible way so that the old API is not
>> broken:
>>
>> arg_count = sscanf(buf, "some %u %u %u", &min_threshold_us,  &arg2, &arg3);
>> if (arg_count < 2) return ERR_PTR(-EINVAL);
>> if (arg_count < 3) {
>>      max_threshold_us = INT_MAX;
>>      window_us = arg2;
>> } else {
>>      max_threshold_us = arg2;
>>      window_us = arg3;
>> }
> OK
> 
> Thanks.
>> But again, the motivation still needs to be explained.
> we want do different operation for different stall level,
> just as prev email explain, multi trigger is also OK in old
> ways, but it is a litter complex.

In fact, I am not keen for this solution, the older and newer
interface is easy to be confused by users, for some resolvable
unclear issues. It's not a good idea.

Thanks
Alex

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] psi: add support for multi level pressure stall trigger
  2022-05-18 10:29         ` Alex Shi
@ 2022-05-18 21:38           ` Suren Baghdasaryan
  2022-05-19  6:15             ` Alex Shi
  0 siblings, 1 reply; 17+ messages in thread
From: Suren Baghdasaryan @ 2022-05-18 21:38 UTC (permalink / raw)
  To: Alex Shi
  Cc: Chen Wandun, LKML, Johannes Weiner, Alex Shi, Jonathan Corbet,
	open list:DOCUMENTATION

On Wed, May 18, 2022 at 3:29 AM Alex Shi <seakeel@gmail.com> wrote:
>
>
>
> On 5/17/22 20:46, Chen Wandun wrote:
> >>>> This breaks the old ABI. And why you need this new function?
> >>> Both great points.
> >> BTW, I think the additional max_threshold parameter could be
> >> implemented in a backward compatible way so that the old API is not
> >> broken:
> >>
> >> arg_count = sscanf(buf, "some %u %u %u", &min_threshold_us,  &arg2, &arg3);
> >> if (arg_count < 2) return ERR_PTR(-EINVAL);
> >> if (arg_count < 3) {
> >>      max_threshold_us = INT_MAX;
> >>      window_us = arg2;
> >> } else {
> >>      max_threshold_us = arg2;
> >>      window_us = arg3;
> >> }
> > OK
> >
> > Thanks.
> >> But again, the motivation still needs to be explained.
> > we want do different operation for different stall level,
> > just as prev email explain, multi trigger is also OK in old
> > ways, but it is a litter complex.
>
> In fact, I am not keen for this solution, the older and newer
> interface is easy to be confused by users, for some resolvable
> unclear issues. It's not a good idea.

Maybe adding the max_threshold as an optional last argument will be
less confusing? Smth like this:

some/full min_threshold window_size [max_threshold]

Also, if we do decide to add it, there should be a warning in the
documentation that max_threshold usage might lead to a stall being
missed completely. In your example:

echo "some 150000 350000 1000000" > /proc/pressure/memory

If there is a stall of more than 350ms within a given window, that
trigger will not fire at all.
Thanks,
Suren.

>
> Thanks
> Alex

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] psi: add support for multi level pressure stall trigger
  2022-05-18 21:38           ` Suren Baghdasaryan
@ 2022-05-19  6:15             ` Alex Shi
  2022-05-19 18:34               ` Suren Baghdasaryan
  2022-05-21  7:23               ` Chen Wandun
  0 siblings, 2 replies; 17+ messages in thread
From: Alex Shi @ 2022-05-19  6:15 UTC (permalink / raw)
  To: Suren Baghdasaryan
  Cc: Chen Wandun, LKML, Johannes Weiner, Alex Shi, Jonathan Corbet,
	open list:DOCUMENTATION



On 5/19/22 05:38, Suren Baghdasaryan wrote:
> On Wed, May 18, 2022 at 3:29 AM Alex Shi <seakeel@gmail.com> wrote:
>>
>>
>>
>> On 5/17/22 20:46, Chen Wandun wrote:
>>>>>> This breaks the old ABI. And why you need this new function?
>>>>> Both great points.
>>>> BTW, I think the additional max_threshold parameter could be
>>>> implemented in a backward compatible way so that the old API is not
>>>> broken:
>>>>
>>>> arg_count = sscanf(buf, "some %u %u %u", &min_threshold_us,  &arg2, &arg3);
>>>> if (arg_count < 2) return ERR_PTR(-EINVAL);
>>>> if (arg_count < 3) {
>>>>      max_threshold_us = INT_MAX;
>>>>      window_us = arg2;
>>>> } else {
>>>>      max_threshold_us = arg2;
>>>>      window_us = arg3;
>>>> }
>>> OK
>>>
>>> Thanks.
>>>> But again, the motivation still needs to be explained.
>>> we want do different operation for different stall level,
>>> just as prev email explain, multi trigger is also OK in old
>>> ways, but it is a litter complex.
>>
>> In fact, I am not keen for this solution, the older and newer
>> interface is easy to be confused by users, for some resolvable
>> unclear issues. It's not a good idea.
> 
> Maybe adding the max_threshold as an optional last argument will be
> less confusing? Smth like this:
> 
> some/full min_threshold window_size [max_threshold]

It's already confused enough. :)
BTW, I still don't see the strong reason for the pressure range.

> > Also, if we do decide to add it, there should be a warning in the
> documentation that max_threshold usage might lead to a stall being
> missed completely. In your example:
> 
> echo "some 150000 350000 1000000" > /proc/pressure/memory
> 
> If there is a stall of more than 350ms within a given window, that
> trigger will not fire at all.

Right. 
And what if others propose more pressure combinations?
Maybe leave them to user space is more likely workable?

Thanks
Alex

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] psi: add support for multi level pressure stall trigger
  2022-05-19  6:15             ` Alex Shi
@ 2022-05-19 18:34               ` Suren Baghdasaryan
  2022-05-21  7:23               ` Chen Wandun
  1 sibling, 0 replies; 17+ messages in thread
From: Suren Baghdasaryan @ 2022-05-19 18:34 UTC (permalink / raw)
  To: Alex Shi
  Cc: Chen Wandun, LKML, Johannes Weiner, Alex Shi, Jonathan Corbet,
	open list:DOCUMENTATION

On Wed, May 18, 2022 at 11:15 PM Alex Shi <seakeel@gmail.com> wrote:
>
>
>
> On 5/19/22 05:38, Suren Baghdasaryan wrote:
> > On Wed, May 18, 2022 at 3:29 AM Alex Shi <seakeel@gmail.com> wrote:
> >>
> >>
> >>
> >> On 5/17/22 20:46, Chen Wandun wrote:
> >>>>>> This breaks the old ABI. And why you need this new function?
> >>>>> Both great points.
> >>>> BTW, I think the additional max_threshold parameter could be
> >>>> implemented in a backward compatible way so that the old API is not
> >>>> broken:
> >>>>
> >>>> arg_count = sscanf(buf, "some %u %u %u", &min_threshold_us,  &arg2, &arg3);
> >>>> if (arg_count < 2) return ERR_PTR(-EINVAL);
> >>>> if (arg_count < 3) {
> >>>>      max_threshold_us = INT_MAX;
> >>>>      window_us = arg2;
> >>>> } else {
> >>>>      max_threshold_us = arg2;
> >>>>      window_us = arg3;
> >>>> }
> >>> OK
> >>>
> >>> Thanks.
> >>>> But again, the motivation still needs to be explained.
> >>> we want do different operation for different stall level,
> >>> just as prev email explain, multi trigger is also OK in old
> >>> ways, but it is a litter complex.
> >>
> >> In fact, I am not keen for this solution, the older and newer
> >> interface is easy to be confused by users, for some resolvable
> >> unclear issues. It's not a good idea.
> >
> > Maybe adding the max_threshold as an optional last argument will be
> > less confusing? Smth like this:
> >
> > some/full min_threshold window_size [max_threshold]
>
> It's already confused enough. :)
> BTW, I still don't see the strong reason for the pressure range.
>
> > > Also, if we do decide to add it, there should be a warning in the
> > documentation that max_threshold usage might lead to a stall being
> > missed completely. In your example:
> >
> > echo "some 150000 350000 1000000" > /proc/pressure/memory
> >
> > If there is a stall of more than 350ms within a given window, that
> > trigger will not fire at all.
>
> Right.
> And what if others propose more pressure combinations?
> Maybe leave them to user space is more likely workable?

Ok, sounds like userspace can handle the situation of multiple
triggers firing. Let's keep it simple as it is now, until we see a
strong need or convincing performance numbers for adding this new
trigger attribute.
Chen, if you can provide reasons why a userspace solution would be
prohibitive I would be happy to reconsider.
Thanks,
Suren.

>
> Thanks
> Alex

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] psi: add support for multi level pressure stall trigger
  2022-05-19  6:15             ` Alex Shi
  2022-05-19 18:34               ` Suren Baghdasaryan
@ 2022-05-21  7:23               ` Chen Wandun
  2022-05-21 10:13                 ` Alex Shi
  1 sibling, 1 reply; 17+ messages in thread
From: Chen Wandun @ 2022-05-21  7:23 UTC (permalink / raw)
  To: Alex Shi, Suren Baghdasaryan
  Cc: LKML, Johannes Weiner, Alex Shi, Jonathan Corbet,
	open list:DOCUMENTATION



在 2022/5/19 14:15, Alex Shi 写道:
>
> On 5/19/22 05:38, Suren Baghdasaryan wrote:
>> On Wed, May 18, 2022 at 3:29 AM Alex Shi <seakeel@gmail.com> wrote:
>>>
>>>
>>> On 5/17/22 20:46, Chen Wandun wrote:
>>>>>>> This breaks the old ABI. And why you need this new function?
>>>>>> Both great points.
>>>>> BTW, I think the additional max_threshold parameter could be
>>>>> implemented in a backward compatible way so that the old API is not
>>>>> broken:
>>>>>
>>>>> arg_count = sscanf(buf, "some %u %u %u", &min_threshold_us,  &arg2, &arg3);
>>>>> if (arg_count < 2) return ERR_PTR(-EINVAL);
>>>>> if (arg_count < 3) {
>>>>>       max_threshold_us = INT_MAX;
>>>>>       window_us = arg2;
>>>>> } else {
>>>>>       max_threshold_us = arg2;
>>>>>       window_us = arg3;
>>>>> }
>>>> OK
>>>>
>>>> Thanks.
>>>>> But again, the motivation still needs to be explained.
>>>> we want do different operation for different stall level,
>>>> just as prev email explain, multi trigger is also OK in old
>>>> ways, but it is a litter complex.
>>> In fact, I am not keen for this solution, the older and newer
>>> interface is easy to be confused by users, for some resolvable
>>> unclear issues. It's not a good idea.
>> Maybe adding the max_threshold as an optional last argument will be
>> less confusing? Smth like this:
>>
>> some/full min_threshold window_size [max_threshold]
> It's already confused enough. :)
which point make you confused?
Interface suggest by Suren is compatible with current version,
I think it is more reasonable and there is no difficuty to understand it.
> BTW, I still don't see the strong reason for the pressure range.
Considering this case:
I divide pressure into multi levels, and each level corresponds to a
hander,  I have to register multi triggers and wait for fire events,
nowadays, these trigger is something like:
echo “some 150000 1000000” > /proc/pressure/memory
echo “some 350000 1000000” > /proc/pressure/memory
echo “some 550000 1000000” > /proc/pressure/memory
echo “some 750000 1000000” > /proc/pressure/memory

In the best case, stall pressure between 150000 and 350000,
only one trigger fire, and only one wakeup.

In any other case,  multi triggers fire and multi wakeup, but it
indeed is no need.

New implement make the fire and wakeup more precise,
userspace code will be more simple, no confusing fire event,
no need to filter fire event anymore, maybe minor performance
improved.

Thanks.
>
>>> Also, if we do decide to add it, there should be a warning in the
>> documentation that max_threshold usage might lead to a stall being
>> missed completely. In your example:
>>
>> echo "some 150000 350000 1000000" > /proc/pressure/memory
>>
>> If there is a stall of more than 350ms within a given window, that
>> trigger will not fire at all.
> Right.
> And what if others propose more pressure combinations?
> Maybe leave them to user space is more likely workable?
>
> Thanks
> Alex
> .


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] psi: add support for multi level pressure stall trigger
  2022-05-21  7:23               ` Chen Wandun
@ 2022-05-21 10:13                 ` Alex Shi
  2022-05-21 10:34                   ` Chen Wandun
  0 siblings, 1 reply; 17+ messages in thread
From: Alex Shi @ 2022-05-21 10:13 UTC (permalink / raw)
  To: Chen Wandun, Suren Baghdasaryan
  Cc: LKML, Johannes Weiner, Alex Shi, Jonathan Corbet,
	open list:DOCUMENTATION



On 5/21/22 15:23, Chen Wandun wrote:
> 
> 
> 在 2022/5/19 14:15, Alex Shi 写道:
>>
>> On 5/19/22 05:38, Suren Baghdasaryan wrote:
>>> On Wed, May 18, 2022 at 3:29 AM Alex Shi <seakeel@gmail.com> wrote:
>>>>
>>>>
>>>> On 5/17/22 20:46, Chen Wandun wrote:
>>>>>>>> This breaks the old ABI. And why you need this new function?
>>>>>>> Both great points.
>>>>>> BTW, I think the additional max_threshold parameter could be
>>>>>> implemented in a backward compatible way so that the old API is not
>>>>>> broken:
>>>>>>
>>>>>> arg_count = sscanf(buf, "some %u %u %u", &min_threshold_us,  &arg2, &arg3);
>>>>>> if (arg_count < 2) return ERR_PTR(-EINVAL);
>>>>>> if (arg_count < 3) {
>>>>>>       max_threshold_us = INT_MAX;
>>>>>>       window_us = arg2;
>>>>>> } else {
>>>>>>       max_threshold_us = arg2;
>>>>>>       window_us = arg3;
>>>>>> }
>>>>> OK
>>>>>
>>>>> Thanks.
>>>>>> But again, the motivation still needs to be explained.
>>>>> we want do different operation for different stall level,
>>>>> just as prev email explain, multi trigger is also OK in old
>>>>> ways, but it is a litter complex.
>>>> In fact, I am not keen for this solution, the older and newer
>>>> interface is easy to be confused by users, for some resolvable
>>>> unclear issues. It's not a good idea.
>>> Maybe adding the max_threshold as an optional last argument will be
>>> less confusing? Smth like this:
>>>
>>> some/full min_threshold window_size [max_threshold]
>> It's already confused enough. :)
> which point make you confused?
> Interface suggest by Suren is compatible with current version,
> I think it is more reasonable and there is no difficuty to understand it.

Your 3rd parameter has different meaning depends on the exists or non-exist
4th one. It's not a good design. 

>> BTW, I still don't see the strong reason for the pressure range.
> Considering this case:
> I divide pressure into multi levels, and each level corresponds to a
> hander,  I have to register multi triggers and wait for fire events,
> nowadays, these trigger is something like:
> echo “some 150000 1000000” > /proc/pressure/memory
> echo “some 350000 1000000” > /proc/pressure/memory
> echo “some 550000 1000000” > /proc/pressure/memory
> echo “some 750000 1000000” > /proc/pressure/memory
> 
> In the best case, stall pressure between 150000 and 350000,
> only one trigger fire, and only one wakeup.
> 
> In any other case,  multi triggers fire and multi wakeup, but it
> indeed is no need.
> 

Could you give more details info to show what detailed problem
which your propose could address, but current code cannot?


Thanks
Alex

> New implement make the fire and wakeup more precise,
> userspace code will be more simple, no confusing fire event,
> no need to filter fire event anymore, maybe minor performance
> improved.
> 
> Thanks.
>>
>>>> Also, if we do decide to add it, there should be a warning in the
>>> documentation that max_threshold usage might lead to a stall being
>>> missed completely. In your example:
>>>
>>> echo "some 150000 350000 1000000" > /proc/pressure/memory
>>>
>>> If there is a stall of more than 350ms within a given window, that
>>> trigger will not fire at all.
>> Right.
>> And what if others propose more pressure combinations?
>> Maybe leave them to user space is more likely workable?
>>
>> Thanks
>> Alex
>> .
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] psi: add support for multi level pressure stall trigger
  2022-05-21 10:13                 ` Alex Shi
@ 2022-05-21 10:34                   ` Chen Wandun
  0 siblings, 0 replies; 17+ messages in thread
From: Chen Wandun @ 2022-05-21 10:34 UTC (permalink / raw)
  To: Alex Shi, Suren Baghdasaryan
  Cc: LKML, Johannes Weiner, Alex Shi, Jonathan Corbet,
	open list:DOCUMENTATION



在 2022/5/21 18:13, Alex Shi 写道:
>
> On 5/21/22 15:23, Chen Wandun wrote:
>>
>> 在 2022/5/19 14:15, Alex Shi 写道:
>>> On 5/19/22 05:38, Suren Baghdasaryan wrote:
>>>> On Wed, May 18, 2022 at 3:29 AM Alex Shi <seakeel@gmail.com> wrote:
>>>>>
>>>>> On 5/17/22 20:46, Chen Wandun wrote:
>>>>>>>>> This breaks the old ABI. And why you need this new function?
>>>>>>>> Both great points.
>>>>>>> BTW, I think the additional max_threshold parameter could be
>>>>>>> implemented in a backward compatible way so that the old API is not
>>>>>>> broken:
>>>>>>>
>>>>>>> arg_count = sscanf(buf, "some %u %u %u", &min_threshold_us,  &arg2, &arg3);
>>>>>>> if (arg_count < 2) return ERR_PTR(-EINVAL);
>>>>>>> if (arg_count < 3) {
>>>>>>>        max_threshold_us = INT_MAX;
>>>>>>>        window_us = arg2;
>>>>>>> } else {
>>>>>>>        max_threshold_us = arg2;
>>>>>>>        window_us = arg3;
>>>>>>> }
>>>>>> OK
>>>>>>
>>>>>> Thanks.
>>>>>>> But again, the motivation still needs to be explained.
>>>>>> we want do different operation for different stall level,
>>>>>> just as prev email explain, multi trigger is also OK in old
>>>>>> ways, but it is a litter complex.
>>>>> In fact, I am not keen for this solution, the older and newer
>>>>> interface is easy to be confused by users, for some resolvable
>>>>> unclear issues. It's not a good idea.
>>>> Maybe adding the max_threshold as an optional last argument will be
>>>> less confusing? Smth like this:
>>>>
>>>> some/full min_threshold window_size [max_threshold]
>>> It's already confused enough. :)
>> which point make you confused?
>> Interface suggest by Suren is compatible with current version,
>> I think it is more reasonable and there is no difficuty to understand it.
> Your 3rd parameter has different meaning depends on the exists or non-exist
> 4th one. It's not a good design.
>
some/full min_threshold window_size [max_threshold]

In this format, the meaning of 3rd parameter keep unchanged.
This format is compatible with current version.

>>> BTW, I still don't see the strong reason for the pressure range.
>> Considering this case:
>> I divide pressure into multi levels, and each level corresponds to a
>> hander,  I have to register multi triggers and wait for fire events,
>> nowadays, these trigger is something like:
>> echo “some 150000 1000000” > /proc/pressure/memory
>> echo “some 350000 1000000” > /proc/pressure/memory
>> echo “some 550000 1000000” > /proc/pressure/memory
>> echo “some 750000 1000000” > /proc/pressure/memory
>>
>> In the best case, stall pressure between 150000 and 350000,
>> only one trigger fire, and only one wakeup.
>>
>> In any other case,  multi triggers fire and multi wakeup, but it
>> indeed is no need.
>>
> Could you give more details info to show what detailed problem
> which your propose could address, but current code cannot?
Current code also can handle this, but thing become complex,jsut
as explained above.

Thanks
Wandun
>
> Thanks
> Alex
>
>> New implement make the fire and wakeup more precise,
>> userspace code will be more simple, no confusing fire event,
>> no need to filter fire event anymore, maybe minor performance
>> improved.
>>
>> Thanks.
>>>>> Also, if we do decide to add it, there should be a warning in the
>>>> documentation that max_threshold usage might lead to a stall being
>>>> missed completely. In your example:
>>>>
>>>> echo "some 150000 350000 1000000" > /proc/pressure/memory
>>>>
>>>> If there is a stall of more than 350ms within a given window, that
>>>> trigger will not fire at all.
>>> Right.
>>> And what if others propose more pressure combinations?
>>> Maybe leave them to user space is more likely workable?
>>>
>>> Thanks
>>> Alex
>>> .
> .


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2022-05-21 10:34 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-16  3:35 [PATCH 1/2] psi: add support for multi level pressure stall trigger Chen Wandun
2022-05-16  3:35 ` [PATCH 2/2] psi: add description about multi level pressure trigger Chen Wandun
2022-05-16  6:20 ` [PATCH 1/2] psi: add support for multi level pressure stall trigger Alex Shi
2022-05-16  8:21   ` Suren Baghdasaryan
2022-05-16  8:43     ` Suren Baghdasaryan
2022-05-17 12:46       ` Chen Wandun
2022-05-17 17:35         ` Suren Baghdasaryan
2022-05-18  9:55           ` Chen Wandun
2022-05-18 10:29         ` Alex Shi
2022-05-18 21:38           ` Suren Baghdasaryan
2022-05-19  6:15             ` Alex Shi
2022-05-19 18:34               ` Suren Baghdasaryan
2022-05-21  7:23               ` Chen Wandun
2022-05-21 10:13                 ` Alex Shi
2022-05-21 10:34                   ` Chen Wandun
2022-05-17  9:38     ` Chen Wandun
2022-05-17  9:08   ` Chen Wandun

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.