linux-mm.kvack.org archive mirror
* [PATCH] mm, oom: enable rate-limiting controls for oom dumps
@ 2020-10-09  9:30 Ricardo Cañuelo
  2020-10-12 15:18 ` Michal Hocko
  2020-10-12 15:22 ` Petr Mladek
  0 siblings, 2 replies; 12+ messages in thread
From: Ricardo Cañuelo @ 2020-10-09  9:30 UTC (permalink / raw)
  To: akpm; +Cc: kernel, hch, guro, rientjes, mcgrof, keescook, yzaikin, linux-mm

Add two sysctl entries (vm.oom_dump_ratelimit and
vm.oom_dump_ratelimit_burst) to control the rate limiting interval and
burst, respectively, of the OOM killer output (oom_kill_process()).

These entries are disabled by default and can be enabled during kernel
configuration with CONFIG_OOM_DUMP_RATELIMIT. They take
DEFAULT_RATELIMIT_INTERVAL and DEFAULT_RATELIMIT_BURST as their default
values.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>
---

In some setups, the amount of output that the OOM killer generates when
it kills a process and dumps the list of tasks can be too
large. Unfortunately, the rate-limiting used for it uses the default
values for the rate limit interval and burst. This patch allows the user
to configure these values.
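
As a rough usage sketch (the values below are purely illustrative; since
the entries use proc_dointvec, the interval is taken as a raw value in
jiffies, just like DEFAULT_RATELIMIT_INTERVAL, which is 5 * HZ):

  # e.g. with HZ=100, allow at most one dump every ~30 seconds
  echo 3000 > /proc/sys/vm/oom_dump_ratelimit
  echo 1 > /proc/sys/vm/oom_dump_ratelimit_burst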

I created a new menu selection inside "printk and dmesg options" for
this option. Many other parts of the kernel use either hardcoded or
default values for the rate limiting parameters. If, in the future, more
of these cases need to be parameterized, this new submenu can be used to
hold any new config option related to rate limiting.

 include/linux/oom.h |  2 ++
 kernel/sysctl.c     | 16 ++++++++++++++++
 lib/Kconfig.debug   | 13 +++++++++++++
 mm/oom_kill.c       |  5 +++++
 4 files changed, 36 insertions(+)

diff --git a/include/linux/oom.h b/include/linux/oom.h
index 2db9a1432511..76c560a1a4c7 100644
--- a/include/linux/oom.h
+++ b/include/linux/oom.h
@@ -127,4 +127,6 @@ extern struct task_struct *find_lock_task_mm(struct task_struct *p);
 extern int sysctl_oom_dump_tasks;
 extern int sysctl_oom_kill_allocating_task;
 extern int sysctl_panic_on_oom;
+extern int sysctl_oom_dump_ratelimit;
+extern int sysctl_oom_dump_ratelimit_burst;
 #endif /* _INCLUDE_LINUX_OOM_H */
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index ce75c67572b9..d348eac7e561 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -2708,6 +2708,22 @@ static struct ctl_table vm_table[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec,
 	},
+#ifdef CONFIG_OOM_DUMP_RATELIMIT
+	{
+		.procname	= "oom_dump_ratelimit",
+		.data		= &sysctl_oom_dump_ratelimit,
+		.maxlen		= sizeof(sysctl_oom_dump_ratelimit),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec,
+	},
+	{
+		.procname	= "oom_dump_ratelimit_burst",
+		.data		= &sysctl_oom_dump_ratelimit_burst,
+		.maxlen		= sizeof(sysctl_oom_dump_ratelimit_burst),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec,
+	},
+#endif
 	{
 		.procname	= "overcommit_ratio",
 		.data		= &sysctl_overcommit_ratio,
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index fde620501707..c57233e56d01 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -195,6 +195,19 @@ config DEBUG_BUGVERBOSE
 	  of the BUG call as well as the EIP and oops trace.  This aids
 	  debugging but costs about 70-100K of memory.
 
+menu "Rate limiting"
+
+config OOM_DUMP_RATELIMIT
+	bool "Enable rate limiting for OOM dumps"
+	default n
+	help
+	  Say Y here to enable the configuration of the rate limiting of
+	  OOM task dumps to the console through sysctl entries
+	  vm.oom_dump_ratelimit (rate limit interval) and
+	  vm.oom_dump_ratelimit_burst (rate limit burst).
+
+endmenu # "Rate limiting"
+
 endmenu # "printk and dmesg options"
 
 menu "Compile-time checks and compiler options"
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 8b84661a6410..2b3fa826a172 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -54,6 +54,8 @@
 int sysctl_panic_on_oom;
 int sysctl_oom_kill_allocating_task;
 int sysctl_oom_dump_tasks = 1;
+int sysctl_oom_dump_ratelimit = DEFAULT_RATELIMIT_INTERVAL;
+int sysctl_oom_dump_ratelimit_burst = DEFAULT_RATELIMIT_BURST;
 
 /*
  * Serializes oom killer invocations (out_of_memory()) from all contexts to
@@ -959,6 +961,9 @@ static void oom_kill_process(struct oom_control *oc, const char *message)
 	static DEFINE_RATELIMIT_STATE(oom_rs, DEFAULT_RATELIMIT_INTERVAL,
 					      DEFAULT_RATELIMIT_BURST);
 
+	oom_rs.interval = sysctl_oom_dump_ratelimit;
+	oom_rs.burst = sysctl_oom_dump_ratelimit_burst;
+
 	/*
 	 * If the task is already exiting, don't alarm the sysadmin or kill
 	 * its children or threads, just give it access to memory reserves
-- 
2.18.0




* Re: [PATCH] mm, oom: enable rate-limiting controls for oom dumps
  2020-10-09  9:30 [PATCH] mm, oom: enable rate-limiting controls for oom dumps Ricardo Cañuelo
@ 2020-10-12 15:18 ` Michal Hocko
  2020-10-13  9:23   ` Ricardo Cañuelo
  2020-10-12 15:22 ` Petr Mladek
  1 sibling, 1 reply; 12+ messages in thread
From: Michal Hocko @ 2020-10-12 15:18 UTC (permalink / raw)
  To: Ricardo Cañuelo
  Cc: akpm, kernel, hch, guro, rientjes, mcgrof, keescook, yzaikin, linux-mm

On Fri 09-10-20 11:30:14, Ricardo Cañuelo wrote:
> Add two sysctl entries (vm.oom_dump_ratelimit and
> vm.oom_dump_ratelimit_burst) to control the rate limiting interval and
> burst, respectively, of the OOM killer output (oom_kill_process()).
> 
> These entries are disabled by default and can be enabled during kernel
> configuration with CONFIG_OOM_DUMP_RATELIMIT. They take
> DEFAULT_RATELIMIT_INTERVAL and DEFAULT_RATELIMIT_BURST as their default
> values.
> 
> Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>
> ---
> 
> In some setups, the amount of output that the OOM killer generates when
> it kills a process and dumps the list of tasks can be too
> large. Unfortunately, the rate-limiting used for it uses the default
> values for the rate limit interval and burst. This patch allows the user
> to configure these values.

How does controlling burst and interval solve the problem? Btw. this
should be part of the changelog, which should explain not only what but
also why the change is needed.

It is true that the oom report can generate a lot of output. This is
something that has been brought up for quite some time. The largest part of
the output tends to be the list of tasks and this seems to be the case
for you as well. Is the list of tasks so important that you need to have
it in the log? If not, you can simply disable this part of the log
altogether by /proc/sys/vm/oom_dump_tasks.
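
For example:

  # skip the per-task dump entirely when the OOM killer fires
  echo 0 > /proc/sys/vm/oom_dump_tasks
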
-- 
Michal Hocko
SUSE Labs



* Re: [PATCH] mm, oom: enable rate-limiting controls for oom dumps
  2020-10-09  9:30 [PATCH] mm, oom: enable rate-limiting controls for oom dumps Ricardo Cañuelo
  2020-10-12 15:18 ` Michal Hocko
@ 2020-10-12 15:22 ` Petr Mladek
  2020-10-12 15:41   ` Michal Hocko
  2020-10-13  9:18   ` Ricardo Cañuelo
  1 sibling, 2 replies; 12+ messages in thread
From: Petr Mladek @ 2020-10-12 15:22 UTC (permalink / raw)
  To: Ricardo Cañuelo, akpm
  Cc: kernel, hch, guro, rientjes, mcgrof, keescook, yzaikin, linux-mm,
	mhocko, Sergey Senozhatsky, Steven Rostedt

> Add two sysctl entries (vm.oom_dump_ratelimit and
> vm.oom_dump_ratelimit_burst) to control the rate limiting interval and
> burst, respectively, of the OOM killer output (oom_kill_process()).
> 
> These entries are disabled by default and can be enabled during kernel
> configuration with CONFIG_OOM_DUMP_RATELIMIT. They take
> DEFAULT_RATELIMIT_INTERVAL and DEFAULT_RATELIMIT_BURST as their default
> values.
> 
> Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>
> ---
> 
> In some setups, the amount of output that the OOM killer generates when
> it kills a process and dumps the list of tasks can be too
> large. Unfortunately, the rate-limiting used for it uses the default
> values for the rate limit interval and burst. This patch allows the user
> to configure these values.

It might be pretty hard to set any reasonable values. It depends on
the console speed and the number of processes on the system. I wonder
how many people would be able to use it in reality.

What about introducing some feedback from the printk code?

     static u64 printk_last_report_seq;

     if (consoles_seen(printk_last_report_seq)) {
	dump_header();
	printk_last_report_seq = printk_get_last_seq();
     }

In other words, it would skip the massive report when the consoles
were not able to show the previous one.


> I created a new menu selection inside "printk and dmesg options" for
> this option. Many other parts of the kernel use either hardcoded or
> default values for the rate limiting parameters. If, in the future, more
> of these cases need to be parameterized, this new submenu can be used to
> hold any new config option related to rate limiting.
> 
>  include/linux/oom.h |  2 ++
>  kernel/sysctl.c     | 16 ++++++++++++++++
>  lib/Kconfig.debug   | 13 +++++++++++++
>  mm/oom_kill.c       |  5 +++++
>  4 files changed, 36 insertions(+)
> 
> diff --git a/include/linux/oom.h b/include/linux/oom.h
> index 2db9a1432511..76c560a1a4c7 100644
> --- a/include/linux/oom.h
> +++ b/include/linux/oom.h
> @@ -127,4 +127,6 @@ extern struct task_struct *find_lock_task_mm(struct task_struct *p);
>  extern int sysctl_oom_dump_tasks;
>  extern int sysctl_oom_kill_allocating_task;
>  extern int sysctl_panic_on_oom;
> +extern int sysctl_oom_dump_ratelimit;
> +extern int sysctl_oom_dump_ratelimit_burst;
>  #endif /* _INCLUDE_LINUX_OOM_H */
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index ce75c67572b9..d348eac7e561 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -2708,6 +2708,22 @@ static struct ctl_table vm_table[] = {
>  		.mode		= 0644,
>  		.proc_handler	= proc_dointvec,
>  	},
> +#ifdef CONFIG_OOM_DUMP_RATELIMIT

I do not see a reason to have this build configurable. The options are
either useful or not.

> +	{
> +		.procname	= "oom_dump_ratelimit",
> +		.data		= &sysctl_oom_dump_ratelimit,
> +		.maxlen		= sizeof(sysctl_oom_dump_ratelimit),
> +		.mode		= 0644,
> +		.proc_handler	= proc_dointvec,
> +	},

> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -54,6 +54,8 @@
>  int sysctl_panic_on_oom;
>  int sysctl_oom_kill_allocating_task;
>  int sysctl_oom_dump_tasks = 1;
> +int sysctl_oom_dump_ratelimit = DEFAULT_RATELIMIT_INTERVAL;
> +int sysctl_oom_dump_ratelimit_burst = DEFAULT_RATELIMIT_BURST;
>  
>  /*
>   * Serializes oom killer invocations (out_of_memory()) from all contexts to
> @@ -959,6 +961,9 @@ static void oom_kill_process(struct oom_control *oc, const char *message)
>  	static DEFINE_RATELIMIT_STATE(oom_rs, DEFAULT_RATELIMIT_INTERVAL,
>  					      DEFAULT_RATELIMIT_BURST);
>  
> +	oom_rs.interval = sysctl_oom_dump_ratelimit;
> +	oom_rs.burst = sysctl_oom_dump_ratelimit_burst;

Why is the _interval suffix omitted from the first variable? I find this
pretty confusing.

Best Regards,
Petr



* Re: [PATCH] mm, oom: enable rate-limiting controls for oom dumps
  2020-10-12 15:22 ` Petr Mladek
@ 2020-10-12 15:41   ` Michal Hocko
  2020-10-13  0:40     ` Tetsuo Handa
  2020-10-13  9:18   ` Ricardo Cañuelo
  1 sibling, 1 reply; 12+ messages in thread
From: Michal Hocko @ 2020-10-12 15:41 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Ricardo Cañuelo, akpm, kernel, hch, guro, rientjes, mcgrof,
	keescook, yzaikin, linux-mm, Sergey Senozhatsky, Steven Rostedt

On Mon 12-10-20 17:22:32, Petr Mladek wrote:
> > Add two sysctl entries (vm.oom_dump_ratelimit and
> > vm.oom_dump_ratelimit_burst) to control the rate limiting interval and
> > burst, respectively, of the OOM killer output (oom_kill_process()).
> > 
> > These entries are disabled by default and can be enabled during kernel
> > configuration with CONFIG_OOM_DUMP_RATELIMIT. They take
> > DEFAULT_RATELIMIT_INTERVAL and DEFAULT_RATELIMIT_BURST as their default
> > values.
> > 
> > Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>
> > ---
> > 
> > In some setups, the amount of output that the OOM killer generates when
> > it kills a process and dumps the list of tasks can be too
> > large. Unfortunately, the rate-limiting used for it uses the default
> > values for the rate limit interval and burst. This patch allows the user
> > to configure these values.
> 
> It might be pretty hard to set any reasonable values. It depends on
> the console speed and the number of processes on the system. I wonder
> how many people would be able to use it in reality.

Agreed!

> What about introducing some feedback from the printk code?
> 
>      static u64 printk_last_report_seq;
> 
>      if (consoles_seen(printk_last_report_seq)) {
> 	dump_header();
> 	printk_last_report_seq = printk_get_last_seq();
>      }
> 
> In other words, it would skip the massive report when the consoles
> were not able to show the previous one.

I am pretty sure this has been discussed in the past but maybe we really
want to make the ratelimit work reasonably also for larger sections
instead. The current implementation only really works if the rate-limited
operation is negligible wrt the interval. Can we have a ratelimit
alternative with a scope effect (effectively lock-like semantics)?
	if (rate_limit_begin(&oom_rs)) {
		dump_header();
		rate_limit_end(&oom_rs);
	}

rate_limit_begin would act like a trylock with an additional constraint on
the period/cadence based on the values marked by rate_limit_end.
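
A minimal sketch of what such a scoped ratelimit could look like (the
helpers are hypothetical, not existing kernel API; initialization,
missed-callback counting and jiffies-wrap details are glossed over):

struct scoped_ratelimit {
	raw_spinlock_t	lock;
	int		interval;	/* min jiffies between sections */
	unsigned long	last_end;	/* when the last section finished */
	bool		inflight;	/* a section is currently running */
};

static bool rate_limit_begin(struct scoped_ratelimit *rs)
{
	unsigned long flags;
	bool ok = false;

	raw_spin_lock_irqsave(&rs->lock, flags);
	/*
	 * Act like a trylock: one section at a time, and only once the
	 * interval measured from the previous section's *end* has expired.
	 */
	if (!rs->inflight &&
	    time_is_before_jiffies(rs->last_end + rs->interval)) {
		rs->inflight = true;
		ok = true;
	}
	raw_spin_unlock_irqrestore(&rs->lock, flags);

	return ok;
}

static void rate_limit_end(struct scoped_ratelimit *rs)
{
	unsigned long flags;

	raw_spin_lock_irqsave(&rs->lock, flags);
	rs->last_end = jiffies;		/* next interval starts now */
	rs->inflight = false;
	raw_spin_unlock_irqrestore(&rs->lock, flags);
}

The difference from __ratelimit() would be that the interval is measured
from when the previous dump *finished*, not from when it started, so a
dump that itself takes seconds on a slow console cannot eat its own
interval.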
-- 
Michal Hocko
SUSE Labs



* Re: [PATCH] mm, oom: enable rate-limiting controls for oom dumps
  2020-10-12 15:41   ` Michal Hocko
@ 2020-10-13  0:40     ` Tetsuo Handa
  2020-10-13  7:25       ` Michal Hocko
  2020-10-13  9:02       ` Petr Mladek
  0 siblings, 2 replies; 12+ messages in thread
From: Tetsuo Handa @ 2020-10-13  0:40 UTC (permalink / raw)
  To: Ricardo Cañuelo
  Cc: Michal Hocko, Petr Mladek, akpm, kernel, hch, guro, rientjes,
	mcgrof, keescook, yzaikin, linux-mm, Sergey Senozhatsky,
	Steven Rostedt

On 2020/10/13 0:41, Michal Hocko wrote:
>> What about introducing some feedback from the printk code?
>>
>>      static u64 printk_last_report_seq;
>>
>>      if (consoles_seen(printk_last_report_seq)) {
>> 	dump_header();
>> 	printk_last_report_seq = printk_get_last_seq();
>>      }
>>
>> In other words, it would skip the massive report when the consoles
>> were not able to show the previous one.
> 
> I am pretty sure this has been discussed in the past but maybe we really
> want to make ratelimit to work reasonably also for larger sections
> instead. Current implementation only really works if the rate limited
> operation is negligible wrt to the interval. Can we have a ratelimit
> alternative with a scope effect (effectivelly lock like semantic)?
> 	if (rate_limit_begin(&oom_rs)) {
> 		dump_header();
> 		rate_limit_end(&oom_rs);
> 	}
> 
> rate_limi_begin would act like a try lock with additional constrain on
> the period/cadence based on rate_limi_end marked values.
> 

Here is one of the past discussions.

  https://lkml.kernel.org/r/7de2310d-afbd-e616-e83a-d75103b986c6@i-love.sakura.ne.jp
  https://lkml.kernel.org/r/20190830103504.GA28313@dhcp22.suse.cz
  https://lkml.kernel.org/r/57be50b2-a97a-e559-e4bd-10d923895f83@i-love.sakura.ne.jp

Michal Hocko complained about different OOM domains, and now just ignores it...

Proper ratelimiting for OOM messages had better not count on asynchronous printk().



* Re: [PATCH] mm, oom: enable rate-limiting controls for oom dumps
  2020-10-13  0:40     ` Tetsuo Handa
@ 2020-10-13  7:25       ` Michal Hocko
  2020-10-13  9:02       ` Petr Mladek
  1 sibling, 0 replies; 12+ messages in thread
From: Michal Hocko @ 2020-10-13  7:25 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Ricardo Cañuelo, Petr Mladek, akpm, kernel, hch, guro,
	rientjes, mcgrof, keescook, yzaikin, linux-mm,
	Sergey Senozhatsky, Steven Rostedt

On Tue 13-10-20 09:40:27, Tetsuo Handa wrote:
> On 2020/10/13 0:41, Michal Hocko wrote:
> >> What about introducing some feedback from the printk code?
> >>
> >>      static u64 printk_last_report_seq;
> >>
> >>      if (consoles_seen(printk_last_report_seq)) {
> >> 	dump_header();
> >> 	printk_last_report_seq = printk_get_last_seq();
> >>      }
> >>
> >> In other words, it would skip the massive report when the consoles
> >> were not able to show the previous one.
> > 
> > I am pretty sure this has been discussed in the past but maybe we really
> > want to make the ratelimit work reasonably also for larger sections
> > instead. The current implementation only really works if the rate-limited
> > operation is negligible wrt the interval. Can we have a ratelimit
> > alternative with a scope effect (effectively lock-like semantics)?
> > 	if (rate_limit_begin(&oom_rs)) {
> > 		dump_header();
> > 		rate_limit_end(&oom_rs);
> > 	}
> > 
> > rate_limit_begin would act like a trylock with an additional constraint on
> > the period/cadence based on the values marked by rate_limit_end.
> > 
> 
> Here is one of the past discussions.
> 
>   https://lkml.kernel.org/r/7de2310d-afbd-e616-e83a-d75103b986c6@i-love.sakura.ne.jp
>   https://lkml.kernel.org/r/20190830103504.GA28313@dhcp22.suse.cz
>   https://lkml.kernel.org/r/57be50b2-a97a-e559-e4bd-10d923895f83@i-love.sakura.ne.jp
> 
> Michal Hocko complained about different OOM domains, and now just ignores it...

None of the above deals with the amount of printed data which is the
point of this patch AFAIU.

> Proper ratelimiting for OOM messages had better not count on asynchronous printk().

Because?

-- 
Michal Hocko
SUSE Labs



* Re: [PATCH] mm, oom: enable rate-limiting controls for oom dumps
  2020-10-13  0:40     ` Tetsuo Handa
  2020-10-13  7:25       ` Michal Hocko
@ 2020-10-13  9:02       ` Petr Mladek
  2020-10-13 10:46         ` Tetsuo Handa
  1 sibling, 1 reply; 12+ messages in thread
From: Petr Mladek @ 2020-10-13  9:02 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Ricardo Cañuelo, Michal Hocko, akpm, kernel, hch, guro,
	rientjes, mcgrof, keescook, yzaikin, linux-mm,
	Sergey Senozhatsky, Steven Rostedt

On Tue 2020-10-13 09:40:27, Tetsuo Handa wrote:
> On 2020/10/13 0:41, Michal Hocko wrote:
> >> What about introducing some feedback from the printk code?
> >>
> >>      static u64 printk_last_report_seq;
> >>
> >>      if (consoles_seen(printk_last_report_seq)) {
> >> 	dump_header();
> >> 	printk_last_report_seq = printk_get_last_seq();
> >>      }
> >>
> >> In other words, it would skip the massive report when the consoles
> >> were not able to show the previous one.
> > 
> > I am pretty sure this has been discussed in the past but maybe we really
> > want to make the ratelimit work reasonably also for larger sections
> > instead. The current implementation only really works if the rate-limited
> > operation is negligible wrt the interval. Can we have a ratelimit
> > alternative with a scope effect (effectively lock-like semantics)?
> > 	if (rate_limit_begin(&oom_rs)) {
> > 		dump_header();
> > 		rate_limit_end(&oom_rs);
> > 	}
> > 
> > rate_limit_begin would act like a trylock with an additional constraint on
> > the period/cadence based on the values marked by rate_limit_end.
> > 
> 
> Here is one of the past discussions.
> 
>   https://lkml.kernel.org/r/7de2310d-afbd-e616-e83a-d75103b986c6@i-love.sakura.ne.jp
>   https://lkml.kernel.org/r/20190830103504.GA28313@dhcp22.suse.cz
>   https://lkml.kernel.org/r/57be50b2-a97a-e559-e4bd-10d923895f83@i-love.sakura.ne.jp
> 
> Michal Hocko complained about different OOM domains, and now just ignores it...

How is this related to this discussion, please? AFAIK, we are
discussing how to tune the values of the existing ratelimiting.

> Proper ratelimiting for OOM messages had better not count on asynchronous printk().

I am a bit confused. AFAIK, you wanted to print OOM messages
asynchronously in the past. The lockless printk ringbuffer is on
its way into 5.10. Handling consoles in kthreads will be the next
step of the printk rework.

OK, the current state is that printk() is semi-synchronous. It does
console_trylock(). The console is handled immediately when it
succeeds. Otherwise it expects that the current console_lock owner
would do the job.

Tuning ratelimits is not trivial for a particular system. It would
be better to have some autotuning. If the printk is synchronous,
we could measure how long the printing took. If it is asynchronous,
we could check whether the last report has already been flushed or
not. We could then decide whether to print the new report.

What is the desired behavior, please?

Could you please provide some examples of how you would tune the ratelimit
when printing all messages to the console takes X ms and OOM
happens every Y ms?

Best Regards,
Petr



* Re: [PATCH] mm, oom: enable rate-limiting controls for oom dumps
  2020-10-12 15:22 ` Petr Mladek
  2020-10-12 15:41   ` Michal Hocko
@ 2020-10-13  9:18   ` Ricardo Cañuelo
  1 sibling, 0 replies; 12+ messages in thread
From: Ricardo Cañuelo @ 2020-10-13  9:18 UTC (permalink / raw)
  To: Petr Mladek
  Cc: akpm, kernel, hch, guro, rientjes, mcgrof, keescook, yzaikin,
	linux-mm, mhocko, Sergey Senozhatsky, Steven Rostedt

Hi Petr,

Thanks for taking the time to look into this.

On Mon 12-10-2020 17:22:32, Petr Mladek wrote:
> It might be pretty hard to set any reasonable values. It depends on
> the console speed and the number of processes on the system. I wonder
> how many people would be able to use it in reality.

I agree that the interface is not obvious to use and is very
system-specific. But the idea was to reuse the same parameter interface
already used by printk_ratelimit. There certainly are some users that
want this, but maybe they'd also be happy with another alternative that
mitigates the problem of having too much OOM console output.

> What about introducing some feedback from the printk code?
> 
>      static u64 printk_last_report_seq;
> 
>      if (consoles_seen(printk_last_report_seq)) {
> 	dump_header();
> 	printk_last_report_seq = printk_get_last_seq();
>      }
> 
> In other words, it would skip the massive report when the consoles
> were not able to show the previous one.

Thanks for the suggestion. I'll take a closer look at the printk
implementation to see if this is a viable alternative.

> I do not see a reason to have this build configurable. The options are
> either useful or not.

I thought that this feature might be too specific to justify adding two
new sysctl entries for everyone.

> Why is the _interval suffix omitted from the first variable? I find this
> pretty confusing.

The names of the sysctl entries mimic those of printk_ratelimit and
printk_ratelimit_burst; I thought people would be familiar with these
already.

Cheers,
Ricardo



* Re: [PATCH] mm, oom: enable rate-limiting controls for oom dumps
  2020-10-12 15:18 ` Michal Hocko
@ 2020-10-13  9:23   ` Ricardo Cañuelo
  2020-10-13 11:56     ` Michal Hocko
  0 siblings, 1 reply; 12+ messages in thread
From: Ricardo Cañuelo @ 2020-10-13  9:23 UTC (permalink / raw)
  To: Michal Hocko
  Cc: akpm, kernel, hch, guro, rientjes, mcgrof, keescook, yzaikin,
	linux-mm, sergey.senozhatsky, pmladek, rostedt

Hi Michal,

Thanks for reviewing the patch and for your suggestions:

On Mon 12-10-2020 17:18:04, Michal Hocko wrote:
> How does controlling burst and interval solve the problem? Btw. this
> should be part of the changelog, which should explain not only what but
> also why the change is needed.
> 
> It is true that the oom report can generate a lot of output. This is
> something that has been brought up for quite some time. The largest part of
> the output tends to be the list of tasks and this seems to be the case
> for you as well. Is the list of tasks so important that you need to have
> it in the log? If not, you can simply disable this part of the log
> altogether by /proc/sys/vm/oom_dump_tasks.

That could be another way to mitigate the problem, although it would lose
some information. I'll try to come up with a better alternative.

Cheers,
Ricardo



* Re: [PATCH] mm, oom: enable rate-limiting controls for oom dumps
  2020-10-13  9:02       ` Petr Mladek
@ 2020-10-13 10:46         ` Tetsuo Handa
  2020-10-15 13:05           ` Petr Mladek
  0 siblings, 1 reply; 12+ messages in thread
From: Tetsuo Handa @ 2020-10-13 10:46 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Ricardo Cañuelo, Michal Hocko, akpm, kernel, hch, guro,
	rientjes, mcgrof, keescook, yzaikin, linux-mm,
	Sergey Senozhatsky, Steven Rostedt

On 2020/10/13 18:02, Petr Mladek wrote:
> On Tue 2020-10-13 09:40:27, Tetsuo Handa wrote:
>> On 2020/10/13 0:41, Michal Hocko wrote:
>>>> What about introducing some feedback from the printk code?
>>>>
>>>>      static u64 printk_last_report_seq;
>>>>
>>>>      if (consoles_seen(printk_last_report_seq)) {
>>>> 	dump_header();
>>>> 	printk_last_report_seq = printk_get_last_seq();
>>>>      }
>>>>
>>>> In other words, it would skip the massive report when the consoles
>>>> were not able to show the previous one.
>>>
>>> I am pretty sure this has been discussed in the past but maybe we really
>>> want to make the ratelimit work reasonably also for larger sections
>>> instead. The current implementation only really works if the rate-limited
>>> operation is negligible wrt the interval. Can we have a ratelimit
>>> alternative with a scope effect (effectively lock-like semantics)?
>>> 	if (rate_limit_begin(&oom_rs)) {
>>> 		dump_header();
>>> 		rate_limit_end(&oom_rs);
>>> 	}
>>>
>>> rate_limit_begin would act like a trylock with an additional constraint on
>>> the period/cadence based on the values marked by rate_limit_end.
>>>
>>
>> Here is one of the past discussions.
>>
>>   https://lkml.kernel.org/r/7de2310d-afbd-e616-e83a-d75103b986c6@i-love.sakura.ne.jp
>>   https://lkml.kernel.org/r/20190830103504.GA28313@dhcp22.suse.cz
>>   https://lkml.kernel.org/r/57be50b2-a97a-e559-e4bd-10d923895f83@i-love.sakura.ne.jp
>>
>> Michal Hocko complained about different OOM domains, and now just ignores it...
> 
> How is this related to this discussion, please? AFAIK, we are
> discussing how to tune the values of the existing ratelimiting.

dump_tasks() is one of the functions called from dump_header().

Since Michal wants to recognize OOM domains when ratelimiting dump_tasks(),
the ratelimit for dump_header() is also expected to recognize OOM domains.

> 
>> Proper ratelimiting for OOM messages had better not count on asynchronous printk().
> 
> I am a bit confused. AFAIK, you wanted to print OOM messages
> asynchronously in the past. The lockless printk ringbuffer is on
> its way into 5.10. Handling consoles in kthreads will be the next
> step of the printk rework.

What I'm proposing is synchronously printing OOM messages from a different
thread, because a single dump_tasks() call can generate thousands of lines,
which may significantly delay the arrival of non-OOM messages at the
consoles (or even cause them to be dropped because the logbuf is full).
I don't want to enqueue too many OOM-related messages to the logbuf, even
after printk() becomes completely asynchronous.

> 
> OK, the current state is that printk() is semi-synchronous. It does
> console_trylock(). The console is handled immediately when it
> succeeds. Otherwise it expects that the current console_lock owner
> would do the job.
> 
> Tuning ratelimits is not trivial for a particular system. It would
> be better to have some autotuning. If the printk is synchronous,
> we could measure how long the printing took. If it is asynchronous,
> we could check whether the last report has already been flushed or
> not. We could then decide whether to print the new report.

Checking whether the last report has already been flushed needs to
recognize OOM domains.

> 
> What is the desired behavior, please?
> 
> Could you please provide some examples of how you would tune the ratelimit
> when printing all messages to the console takes X ms and OOM
> happens every Y ms?

My proposal is to decide whether to print the new report based on
whether all OOM candidates for that OOM domain have been flushed to
consoles. There is no X and Y.




* Re: [PATCH] mm, oom: enable rate-limiting controls for oom dumps
  2020-10-13  9:23   ` Ricardo Cañuelo
@ 2020-10-13 11:56     ` Michal Hocko
  0 siblings, 0 replies; 12+ messages in thread
From: Michal Hocko @ 2020-10-13 11:56 UTC (permalink / raw)
  To: Ricardo Cañuelo
  Cc: akpm, kernel, hch, guro, rientjes, mcgrof, keescook, yzaikin,
	linux-mm, sergey.senozhatsky, pmladek, rostedt

On Tue 13-10-20 11:23:39, Ricardo Cañuelo wrote:
> Hi Michal,
> 
> Thanks for reviewing the patch and for your suggestions:
> 
> On lun 12-10-2020 17:18:04, Michal Hocko wrote:
> > How does controling burst and interval solve the problem? Btw. this
> > should be a part of the changelog which should explain not only what but
> > also why the change is needed.
> > 
> > It is true that the oom report can generate a lot of output. This is
> > something that has been brought up for quite some time. The largest part of
> > the output tends to be the list of tasks and this seems to be the case
> > for you as well. Is the list of tasks so important that you need to have
> > it in the log? If not, you can simply disable this part of the log
> > altogether by /proc/sys/vm/oom_dump_tasks.
> 
> That could be another way to mitigate the problem, although it would lose
> some information. I'll try to come up with a better alternative.

Yes, but there is no way around this without losing information. You are
either losing whole reports or a very noisy part of a report. I consider
the first option to be worse because eligible tasks list is useful only
very seldom.

Another aspect of the problem is the effectively broken rate limiting
caused by the way it is implemented. That is something worth
looking into IMHO.
-- 
Michal Hocko
SUSE Labs



* Re: [PATCH] mm, oom: enable rate-limiting controls for oom dumps
  2020-10-13 10:46         ` Tetsuo Handa
@ 2020-10-15 13:05           ` Petr Mladek
  0 siblings, 0 replies; 12+ messages in thread
From: Petr Mladek @ 2020-10-15 13:05 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Ricardo Cañuelo, Michal Hocko, akpm, kernel, hch, guro,
	rientjes, mcgrof, keescook, yzaikin, linux-mm,
	Sergey Senozhatsky, Steven Rostedt

On Tue 2020-10-13 19:46:32, Tetsuo Handa wrote:
> On 2020/10/13 18:02, Petr Mladek wrote:
> > On Tue 2020-10-13 09:40:27, Tetsuo Handa wrote:
> >> Proper ratelimiting for OOM messages had better not count on asynchronous printk().
> > 
> > I am a bit confused. AFAIK, you wanted to print OOM messages
> > asynchronously in the past. The lockless printk ringbuffer is on
> > its way into 5.10. Handling consoles in kthreads will be the next
> > step of the printk rework.
> 
> What I'm proposing is synchronously printing OOM messages from a different
> thread, because a single dump_tasks() call can generate thousands of lines,
> which may significantly delay the arrival of non-OOM messages at the
> consoles (or even cause them to be dropped because the logbuf is full).
> I don't want to enqueue too many OOM-related messages to the logbuf, even
> after printk() becomes completely asynchronous.

This looks like a lot of complexity. I am not convinced that it is
worth it. I could understand that people heavily testing OOM behavior
meet this problem. But I wonder how many people really meet this
problem in real life.

> > Could you please provide some examples of how you would tune the ratelimit
> > when printing all messages to the console takes X ms and OOM
> > happens every Y ms?
> 
> My proposal is to decide whether to print the new report based on
> whether all OOM candidates for that OOM domain have been flushed to
> consoles. There is no X and Y.

From the printk() point of view, we need an API that would provide
information on whether a given message has reached the consoles or not.
Then it would be up to the MM-code to use it.

One catch might be when console_seq is synchronized by console_lock,
because the caller of this API would become responsible for flushing
all existing messages. But it should be usable; it is currently
also synchronized by logbuf_lock.
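
A very rough sketch of such an API (hypothetical, not existing printk
code; it reads the counters under logbuf_lock as described above and
ignores the finer ordering questions):

/* sequence number of the newest record stored in the log buffer */
u64 printk_get_last_seq(void)
{
	u64 seq;

	logbuf_lock_irq();
	seq = log_next_seq - 1;
	logbuf_unlock_irq();

	return seq;
}

/* true once the consoles have printed everything up to @seq */
bool printk_seq_on_console(u64 seq)
{
	u64 seen;

	logbuf_lock_irq();
	seen = console_seq;
	logbuf_unlock_irq();

	return seen > seq;
}

A caller like oom_kill_process() could then remember the value returned
after its last dump_header() and only emit a new report once
printk_seq_on_console() confirms the previous one made it out, much like
the consoles_seen()/printk_get_last_seq() sketch earlier in the thread.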

Best Regards,
Petr



Thread overview: 12+ messages
2020-10-09  9:30 [PATCH] mm, oom: enable rate-limiting controls for oom dumps Ricardo Cañuelo
2020-10-12 15:18 ` Michal Hocko
2020-10-13  9:23   ` Ricardo Cañuelo
2020-10-13 11:56     ` Michal Hocko
2020-10-12 15:22 ` Petr Mladek
2020-10-12 15:41   ` Michal Hocko
2020-10-13  0:40     ` Tetsuo Handa
2020-10-13  7:25       ` Michal Hocko
2020-10-13  9:02       ` Petr Mladek
2020-10-13 10:46         ` Tetsuo Handa
2020-10-15 13:05           ` Petr Mladek
2020-10-13  9:18   ` Ricardo Cañuelo
