linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] ring-buffer: Add set/clear_current_oom_origin() during allocations
@ 2018-04-04 15:53 Steven Rostedt
  2018-04-04 16:00 ` Steven Rostedt
  0 siblings, 1 reply; 12+ messages in thread
From: Steven Rostedt @ 2018-04-04 15:53 UTC (permalink / raw)
  To: LKML
  Cc: Michal Hocko, Zhaoyang Huang, Ingo Molnar, linux-kernel,
	kernel-patch-test, Andrew Morton, Joel Fernandes, linux-mm,
	Vlastimil Babka

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

As si_mem_available() can say there is enough memory even though the memory
available is not useable by the ring buffer, it is best to not kill innocent
applications because the ring buffer is taking up all the memory while it is
trying to allocate a great deal of memory.

If the allocator is user space (because kernel threads can also increase the
size of the kernel ring buffer on boot up), then after si_mem_available()
says there is enough memory, set the OOM killer to kill the current task if
an OOM triggers during the allocation.

Link: http://lkml.kernel.org/r/20180404062340.GD6312@dhcp22.suse.cz

Suggested-by: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/ring_buffer.c | 48 ++++++++++++++++++++++++++++++++++++----------
 1 file changed, 38 insertions(+), 10 deletions(-)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 966128f02121..c9cb9767d49b 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -22,6 +22,7 @@
 #include <linux/hash.h>
 #include <linux/list.h>
 #include <linux/cpu.h>
+#include <linux/oom.h>
 
 #include <asm/local.h>
 
@@ -1162,35 +1163,60 @@ static int rb_check_pages(struct ring_buffer_per_cpu *cpu_buffer)
 static int __rb_allocate_pages(long nr_pages, struct list_head *pages, int cpu)
 {
 	struct buffer_page *bpage, *tmp;
+	bool user_thread = current->mm != NULL;
+	gfp_t mflags;
 	long i;
 
-	/* Check if the available memory is there first */
+	/*
+	 * Check if the available memory is there first.
+	 * Note, si_mem_available() only gives us a rough estimate of available
+	 * memory. It may not be accurate. But we don't care, we just want
+	 * to prevent doing any allocation when it is obvious that it is
+	 * not going to succeed.
+	 */
 	i = si_mem_available();
 	if (i < nr_pages)
 		return -ENOMEM;
 
+	/*
+	 * __GFP_RETRY_MAYFAIL flag makes sure that the allocation fails
+	 * gracefully without invoking oom-killer and the system is not
+	 * destabilized.
+	 */
+	mflags = GFP_KERNEL | __GFP_RETRY_MAYFAIL;
+
+	/*
+	 * If a user thread allocates too much, and si_mem_available()
+	 * reports there's enough memory, even though there is not.
+	 * Make sure the OOM killer kills this thread. This can happen
+	 * even with RETRY_MAYFAIL because another task may be doing
+	 * an allocation after this task has taken all memory.
+	 * This is the task the OOM killer needs to take out during this
+	 * loop, even if it was triggered by an allocation somewhere else.
+	 */
+	if (user_thread)
+		set_current_oom_origin();
 	for (i = 0; i < nr_pages; i++) {
 		struct page *page;
-		/*
-		 * __GFP_RETRY_MAYFAIL flag makes sure that the allocation fails
-		 * gracefully without invoking oom-killer and the system is not
-		 * destabilized.
-		 */
+
 		bpage = kzalloc_node(ALIGN(sizeof(*bpage), cache_line_size()),
-				    GFP_KERNEL | __GFP_RETRY_MAYFAIL,
-				    cpu_to_node(cpu));
+				    mflags, cpu_to_node(cpu));
 		if (!bpage)
 			goto free_pages;
 
 		list_add(&bpage->list, pages);
 
-		page = alloc_pages_node(cpu_to_node(cpu),
-					GFP_KERNEL | __GFP_RETRY_MAYFAIL, 0);
+		page = alloc_pages_node(cpu_to_node(cpu), mflags, 0);
 		if (!page)
 			goto free_pages;
 		bpage->page = page_address(page);
 		rb_init_page(bpage->page);
+
+		if (user_thread && fatal_signal_pending(current))
+			goto free_pages;
 	}
+	if (user_thread)
+		clear_current_oom_origin();
 
 	return 0;
 
@@ -1199,6 +1225,8 @@ static int __rb_allocate_pages(long nr_pages, struct list_head *pages, int cpu)
 		list_del_init(&bpage->list);
 		free_buffer_page(bpage);
 	}
+	if (user_thread)
+		clear_current_oom_origin();
 
 	return -ENOMEM;
 }
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] ring-buffer: Add set/clear_current_oom_origin() during allocations
  2018-04-04 15:53 [PATCH] ring-buffer: Add set/clear_current_oom_origin() during allocations Steven Rostedt
@ 2018-04-04 16:00 ` Steven Rostedt
  2018-04-04 16:03   ` Joel Fernandes
  0 siblings, 1 reply; 12+ messages in thread
From: Steven Rostedt @ 2018-04-04 16:00 UTC (permalink / raw)
  To: LKML
  Cc: Michal Hocko, Zhaoyang Huang, Ingo Molnar, kernel-patch-test,
	Andrew Morton, Joel Fernandes, linux-mm, Vlastimil Babka

On Wed, 4 Apr 2018 11:53:10 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> @@ -1162,35 +1163,60 @@ static int rb_check_pages(struct ring_buffer_per_cpu *cpu_buffer)
>  static int __rb_allocate_pages(long nr_pages, struct list_head *pages, int cpu)
>  {
>  	struct buffer_page *bpage, *tmp;
> +	bool user_thread = current->mm != NULL;
> +	gfp_t mflags;
>  	long i;
>  
> -	/* Check if the available memory is there first */
> +	/*
> +	 * Check if the available memory is there first.
> +	 * Note, si_mem_available() only gives us a rough estimate of available
> +	 * memory. It may not be accurate. But we don't care, we just want
> +	 * to prevent doing any allocation when it is obvious that it is
> +	 * not going to succeed.
> +	 */

In case you are wondering how I tested this, I simply added:

#if 0
>  	i = si_mem_available();
>  	if (i < nr_pages)
>  		return -ENOMEM;
#endif

for the tests. Note, without this, I tried to allocate all memory
(bisecting it with allocations that failed and allocations that
succeeded), and couldn't trigger an OOM :-/

Of course, this was on x86_64, where I'm sure I could allocate
any memory, and probably would have had more luck with a 32bit kernel
using higmem.

-- Steve

>  
> +	/*
> +	 * __GFP_RETRY_MAYFAIL flag makes sure that the allocation fails
> +	 * gracefully without invoking oom-killer and the system is not
> +	 * destabilized.
> +	 */
> +	mflags = GFP_KERNEL | __GFP_RETRY_MAYFAIL;
> +
> +	/*
> +	 * If a user thread allocates too much, and si_mem_available()
> +	 * reports there's enough memory, even though there is not.
> +	 * Make sure the OOM killer kills this thread. This can happen
> +	 * even with RETRY_MAYFAIL because another task may be doing
> +	 * an allocation after this task has taken all memory.
> +	 * This is the task the OOM killer needs to take out during this
> +	 * loop, even if it was triggered by an allocation somewhere else.
> +	 */
> +	if (user_thread)
> +		set_current_oom_origin();
>  	for (i = 0; i < nr_pages; i++) {
>  		struct page *page;
> -		/*
> -		 * __GFP_RETRY_MAYFAIL flag makes sure that the allocation fails
> -		 * gracefully without invoking oom-killer and the system is not
> -		 * destabilized.
> -		 */
> +
>  		bpage = kzalloc_node(ALIGN(sizeof(*bpage), cache_line_size()),
> -				    GFP_KERNEL | __GFP_RETRY_MAYFAIL,
> -				    cpu_to_node(cpu));
> +				    mflags, cpu_to_node(cpu));
>  		if (!bpage)
>  			goto free_pages;
>  
>  		list_add(&bpage->list, pages);
>  
> -		page = alloc_pages_node(cpu_to_node(cpu),
> -					GFP_KERNEL | __GFP_RETRY_MAYFAIL, 0);
> +		page = alloc_pages_node(cpu_to_node(cpu), mflags, 0);
>  		if (!page)
>  			goto free_pages;
>  		bpage->page = page_address(page);
>  		rb_init_page(bpage->page);
> +
> +		if (user_thread && fatal_signal_pending(current))
> +			goto free_pages;
>  	}
> +	if (user_thread)
> +		clear_current_oom_origin();
>  
>  	return 0;
>  
> @@ -1199,6 +1225,8 @@ static int __rb_allocate_pages(long nr_pages, struct list_head *pages, int cpu)
>  		list_del_init(&bpage->list);
>  		free_buffer_page(bpage);
>  	}
> +	if (user_thread)
> +		clear_current_oom_origin();
>  
>  	return -ENOMEM;
>  }

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] ring-buffer: Add set/clear_current_oom_origin() during allocations
  2018-04-04 16:00 ` Steven Rostedt
@ 2018-04-04 16:03   ` Joel Fernandes
  2018-04-04 16:13     ` Steven Rostedt
  0 siblings, 1 reply; 12+ messages in thread
From: Joel Fernandes @ 2018-04-04 16:03 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Michal Hocko, Zhaoyang Huang, Ingo Molnar,
	kernel-patch-test, Andrew Morton, open list:MEMORY MANAGEMENT,
	Vlastimil Babka

On Wed, Apr 4, 2018 at 9:00 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Wed, 4 Apr 2018 11:53:10 -0400
> Steven Rostedt <rostedt@goodmis.org> wrote:
>
>> @@ -1162,35 +1163,60 @@ static int rb_check_pages(struct ring_buffer_per_cpu *cpu_buffer)
>>  static int __rb_allocate_pages(long nr_pages, struct list_head *pages, int cpu)
>>  {
>>       struct buffer_page *bpage, *tmp;
>> +     bool user_thread = current->mm != NULL;
>> +     gfp_t mflags;
>>       long i;
>>
>> -     /* Check if the available memory is there first */
>> +     /*
>> +      * Check if the available memory is there first.
>> +      * Note, si_mem_available() only gives us a rough estimate of available
>> +      * memory. It may not be accurate. But we don't care, we just want
>> +      * to prevent doing any allocation when it is obvious that it is
>> +      * not going to succeed.
>> +      */
>
> In case you are wondering how I tested this, I simply added:
>
> #if 0
>>       i = si_mem_available();
>>       if (i < nr_pages)
>>               return -ENOMEM;
> #endif
>
> for the tests. Note, without this, I tried to allocate all memory
> (bisecting it with allocations that failed and allocations that
> succeeded), and couldn't trigger an OOM :-/

I guess you need to have something *else* other than the write to
buffer_size_kb doing the GFP_KERNEL allocations but unfortunately gets
OOM killed?

Also, I agree with the new patch and its nice idea to do that.

thanks,

- Joel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] ring-buffer: Add set/clear_current_oom_origin() during allocations
  2018-04-04 16:03   ` Joel Fernandes
@ 2018-04-04 16:13     ` Steven Rostedt
  2018-04-04 16:18       ` Joel Fernandes
  0 siblings, 1 reply; 12+ messages in thread
From: Steven Rostedt @ 2018-04-04 16:13 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: LKML, Michal Hocko, Zhaoyang Huang, Ingo Molnar,
	kernel-patch-test, Andrew Morton, open list:MEMORY MANAGEMENT,
	Vlastimil Babka

On Wed, 4 Apr 2018 09:03:47 -0700
Joel Fernandes <joelaf@google.com> wrote:


> > for the tests. Note, without this, I tried to allocate all memory
> > (bisecting it with allocations that failed and allocations that
> > succeeded), and couldn't trigger an OOM :-/  
> 
> I guess you need to have something *else* other than the write to
> buffer_size_kb doing the GFP_KERNEL allocations but unfortunately gets
> OOM killed?

Yeah, for some reason, my test box seems to always have something doing
that, because I trigger an OOM about 2 out of every 3 tries.

Here's the tasks that trigger it:

 lvmetad, crond, systemd-journal, abrt-dump-journ, chronyd,

I guess my system is rather busy even when idle :-/

> 
> Also, I agree with the new patch and its nice idea to do that.

Thanks, want to give it a test too?

-- Steve

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] ring-buffer: Add set/clear_current_oom_origin() during allocations
  2018-04-04 16:13     ` Steven Rostedt
@ 2018-04-04 16:18       ` Joel Fernandes
  2018-04-04 23:59         ` Joel Fernandes
  0 siblings, 1 reply; 12+ messages in thread
From: Joel Fernandes @ 2018-04-04 16:18 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Michal Hocko, Zhaoyang Huang, Ingo Molnar,
	kernel-patch-test, Andrew Morton, open list:MEMORY MANAGEMENT,
	Vlastimil Babka

On Wed, Apr 4, 2018 at 9:13 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
[..]
>>
>> Also, I agree with the new patch and its nice idea to do that.
>
> Thanks, want to give it a test too?

Sure, I'll try it in a few hours. I am thinking of trying some of the
memory pressure tools we have. Likely by noon or so once I look at
some day job things :-D

thanks,

- Joel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] ring-buffer: Add set/clear_current_oom_origin() during allocations
  2018-04-04 16:18       ` Joel Fernandes
@ 2018-04-04 23:59         ` Joel Fernandes
  2018-04-05 13:43           ` Steven Rostedt
  2018-04-05 14:51           ` Michal Hocko
  0 siblings, 2 replies; 12+ messages in thread
From: Joel Fernandes @ 2018-04-04 23:59 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Michal Hocko, Zhaoyang Huang, Ingo Molnar,
	kernel-patch-test, Andrew Morton, open list:MEMORY MANAGEMENT,
	Vlastimil Babka

Hi Steve,

On Wed, Apr 4, 2018 at 9:18 AM, Joel Fernandes <joelaf@google.com> wrote:
> On Wed, Apr 4, 2018 at 9:13 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
> [..]
>>>
>>> Also, I agree with the new patch and its nice idea to do that.
>>
>> Thanks, want to give it a test too?

With the latest tree and the below diff, I can still OOM-kill a victim
process doing a large buffer_size_kb write:

I pulled your ftrace/core and added this:
+       /*
        i = si_mem_available();
        if (i < nr_pages)
                return -ENOMEM;
+       */

Here's a run in Qemu with 4-cores 1GB total memory:

bash-4.3# ./m -m 1M &
[1] 1056
bash-4.3#
bash-4.3#
bash-4.3#
bash-4.3# echo 10000000 > /d/tracing/buffer_size_kb
[   33.213988] Out of memory: Kill process 1042 (bash) score
1712050900 or sacrifice child
[   33.215349] Killed process 1056 (m) total-vm:9220kB,
anon-rss:7564kB, file-rss:4kB, shmem-rss:640kB
bash: echo: write error: Cannot allocate memory
[1]+  Killed                  ./m -m 1M
bash-4.3#
--

As you can see, OOM killer triggers and kills "m" which is my busy
memory allocator (it allocates and frees lots of memory and does that
in a loop)

Here's the m program, sorry if it looks too ugly:
https://pastebin.com/raw/aG6Qw37Z

Happy to try anything else, BTW when the si_mem_available check
enabled, this doesn't happen and the buffer_size_kb write fails
normally without hurting anything else.

- Joel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] ring-buffer: Add set/clear_current_oom_origin() during allocations
  2018-04-04 23:59         ` Joel Fernandes
@ 2018-04-05 13:43           ` Steven Rostedt
  2018-04-05 19:57             ` Joel Fernandes
  2018-04-05 14:51           ` Michal Hocko
  1 sibling, 1 reply; 12+ messages in thread
From: Steven Rostedt @ 2018-04-05 13:43 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: LKML, Michal Hocko, Zhaoyang Huang, Ingo Molnar,
	kernel-patch-test, Andrew Morton, open list:MEMORY MANAGEMENT,
	Vlastimil Babka

On Wed, 4 Apr 2018 16:59:18 -0700
Joel Fernandes <joelaf@google.com> wrote:

> Happy to try anything else, BTW when the si_mem_available check
> enabled, this doesn't happen and the buffer_size_kb write fails
> normally without hurting anything else.

Can you remove the RETRY_MAYFAIL and see if you can try again? It may
be that we just remove that, and if si_mem_available() is wrong, it
will kill the process :-/ My original code would only add MAYFAIL if it
was a kernel thread (which is why I created the mflags variable).

-- Steve

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] ring-buffer: Add set/clear_current_oom_origin() during allocations
  2018-04-04 23:59         ` Joel Fernandes
  2018-04-05 13:43           ` Steven Rostedt
@ 2018-04-05 14:51           ` Michal Hocko
  2018-04-05 19:47             ` Joel Fernandes
  1 sibling, 1 reply; 12+ messages in thread
From: Michal Hocko @ 2018-04-05 14:51 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: Steven Rostedt, LKML, Zhaoyang Huang, Ingo Molnar,
	kernel-patch-test, Andrew Morton, open list:MEMORY MANAGEMENT,
	Vlastimil Babka

On Wed 04-04-18 16:59:18, Joel Fernandes wrote:
> Hi Steve,
> 
> On Wed, Apr 4, 2018 at 9:18 AM, Joel Fernandes <joelaf@google.com> wrote:
> > On Wed, Apr 4, 2018 at 9:13 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
> > [..]
> >>>
> >>> Also, I agree with the new patch and its nice idea to do that.
> >>
> >> Thanks, want to give it a test too?
> 
> With the latest tree and the below diff, I can still OOM-kill a victim
> process doing a large buffer_size_kb write:
> 
> I pulled your ftrace/core and added this:
> +       /*
>         i = si_mem_available();
>         if (i < nr_pages)
>                 return -ENOMEM;
> +       */
> 
> Here's a run in Qemu with 4-cores 1GB total memory:
> 
> bash-4.3# ./m -m 1M &
> [1] 1056
> bash-4.3#
> bash-4.3#
> bash-4.3#
> bash-4.3# echo 10000000 > /d/tracing/buffer_size_kb
> [   33.213988] Out of memory: Kill process 1042 (bash) score
> 1712050900 or sacrifice child
> [   33.215349] Killed process 1056 (m) total-vm:9220kB,
> anon-rss:7564kB, file-rss:4kB, shmem-rss:640kB

OK, so the reason your memory hog is triggered is that your echo is
built-in and we properly select bask as an oom_origin but then another
clever heuristic jumps in and tries to reduce the damage by sacrificing
a child process. And your memory hog runs as a child from the same bash
session.

I cannot say I would love this heuristic. In fact I would really love to
dig it deep under the ground. But this is a harder sell than it might
seem. Anyway is your testing scenario really representative enough to
care? Does the buffer_size_kb updater runs in the same process as any
large memory process?

> bash: echo: write error: Cannot allocate memory
> [1]+  Killed                  ./m -m 1M
> bash-4.3#
> --

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] ring-buffer: Add set/clear_current_oom_origin() during allocations
  2018-04-05 14:51           ` Michal Hocko
@ 2018-04-05 19:47             ` Joel Fernandes
  0 siblings, 0 replies; 12+ messages in thread
From: Joel Fernandes @ 2018-04-05 19:47 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Steven Rostedt, LKML, Zhaoyang Huang, Ingo Molnar,
	kernel-patch-test, Andrew Morton, open list:MEMORY MANAGEMENT,
	Vlastimil Babka

On Thu, Apr 5, 2018 at 7:51 AM, Michal Hocko <mhocko@kernel.org> wrote:
> On Wed 04-04-18 16:59:18, Joel Fernandes wrote:
>> Hi Steve,
>>
>> On Wed, Apr 4, 2018 at 9:18 AM, Joel Fernandes <joelaf@google.com> wrote:
>> > On Wed, Apr 4, 2018 at 9:13 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
>> > [..]
>> >>>
>> >>> Also, I agree with the new patch and its nice idea to do that.
>> >>
>> >> Thanks, want to give it a test too?
>>
>> With the latest tree and the below diff, I can still OOM-kill a victim
>> process doing a large buffer_size_kb write:
>>
>> I pulled your ftrace/core and added this:
>> +       /*
>>         i = si_mem_available();
>>         if (i < nr_pages)
>>                 return -ENOMEM;
>> +       */
>>
>> Here's a run in Qemu with 4-cores 1GB total memory:
>>
>> bash-4.3# ./m -m 1M &
>> [1] 1056
>> bash-4.3#
>> bash-4.3#
>> bash-4.3#
>> bash-4.3# echo 10000000 > /d/tracing/buffer_size_kb
>> [   33.213988] Out of memory: Kill process 1042 (bash) score
>> 1712050900 or sacrifice child
>> [   33.215349] Killed process 1056 (m) total-vm:9220kB,
>> anon-rss:7564kB, file-rss:4kB, shmem-rss:640kB
>
> OK, so the reason your memory hog is triggered is that your echo is
> built-in and we properly select bask as an oom_origin but then another
> clever heuristic jumps in and tries to reduce the damage by sacrificing
> a child process. And your memory hog runs as a child from the same bash
> session.

Oh, ok. Makes sense.

>
> I cannot say I would love this heuristic. In fact I would really love to
> dig it deep under the ground. But this is a harder sell than it might
> seem. Anyway is your testing scenario really representative enough to

No honestly I don't care much for this heuristic but was just helping
test it. The scenario is not something I care about, but it seems like
if I hit it then others users will too. Maybe Zhaoyang can try his use
case again with ftrace/core and si_mem_available commented?

IOW I was just helping test the new patch with the si_mem_available
check commented out.

> care? Does the buffer_size_kb updater runs in the same process as any
> large memory process?

In this Qemu run its just the cat process. At work I use trace-cmd or
atrace neither of which I believe have large memory footprints (AFAIK)

Thanks,

- Joel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] ring-buffer: Add set/clear_current_oom_origin() during allocations
  2018-04-05 13:43           ` Steven Rostedt
@ 2018-04-05 19:57             ` Joel Fernandes
  2018-04-05 23:36               ` Joel Fernandes
  0 siblings, 1 reply; 12+ messages in thread
From: Joel Fernandes @ 2018-04-05 19:57 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Michal Hocko, Zhaoyang Huang, Ingo Molnar,
	kernel-patch-test, Andrew Morton, open list:MEMORY MANAGEMENT,
	Vlastimil Babka

Hi Steve,

On Thu, Apr 5, 2018 at 6:43 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Wed, 4 Apr 2018 16:59:18 -0700
> Joel Fernandes <joelaf@google.com> wrote:
>
>> Happy to try anything else, BTW when the si_mem_available check
>> enabled, this doesn't happen and the buffer_size_kb write fails
>> normally without hurting anything else.
>
> Can you remove the RETRY_MAYFAIL and see if you can try again? It may
> be that we just remove that, and if si_mem_available() is wrong, it
> will kill the process :-/ My original code would only add MAYFAIL if it
> was a kernel thread (which is why I created the mflags variable).

Tried this. Dropping RETRY_MAYFAIL and the si_mem_available check
destabilized the system and brought it down (along with OOM killing
the victim).

System hung for several seconds and then both the memory hog and bash
got killed.

thanks,

- Joel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] ring-buffer: Add set/clear_current_oom_origin() during allocations
  2018-04-05 19:57             ` Joel Fernandes
@ 2018-04-05 23:36               ` Joel Fernandes
  2018-04-06  7:16                 ` Zhaoyang Huang
  0 siblings, 1 reply; 12+ messages in thread
From: Joel Fernandes @ 2018-04-05 23:36 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Michal Hocko, Zhaoyang Huang, Ingo Molnar,
	kernel-patch-test, Andrew Morton, open list:MEMORY MANAGEMENT,
	Vlastimil Babka

Hi Steve,

On Thu, Apr 5, 2018 at 12:57 PM, Joel Fernandes <joelaf@google.com> wrote:
> On Thu, Apr 5, 2018 at 6:43 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
>> On Wed, 4 Apr 2018 16:59:18 -0700
>> Joel Fernandes <joelaf@google.com> wrote:
>>
>>> Happy to try anything else, BTW when the si_mem_available check
>>> enabled, this doesn't happen and the buffer_size_kb write fails
>>> normally without hurting anything else.
>>
>> Can you remove the RETRY_MAYFAIL and see if you can try again? It may
>> be that we just remove that, and if si_mem_available() is wrong, it
>> will kill the process :-/ My original code would only add MAYFAIL if it
>> was a kernel thread (which is why I created the mflags variable).
>
> Tried this. Dropping RETRY_MAYFAIL and the si_mem_available check
> destabilized the system and brought it down (along with OOM killing
> the victim).
>
> System hung for several seconds and then both the memory hog and bash
> got killed.

I think its still Ok to keep the OOM patch as a safe guard even though
its hard to test, and the si_mem_available on its own seem sufficient.
What do you think?

thanks,


- Joel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] ring-buffer: Add set/clear_current_oom_origin() during allocations
  2018-04-05 23:36               ` Joel Fernandes
@ 2018-04-06  7:16                 ` Zhaoyang Huang
  0 siblings, 0 replies; 12+ messages in thread
From: Zhaoyang Huang @ 2018-04-06  7:16 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: Steven Rostedt, LKML, Michal Hocko, Ingo Molnar,
	kernel-patch-test, Andrew Morton, open list:MEMORY MANAGEMENT,
	Vlastimil Babka

On Fri, Apr 6, 2018 at 7:36 AM, Joel Fernandes <joelaf@google.com> wrote:
> Hi Steve,
>
> On Thu, Apr 5, 2018 at 12:57 PM, Joel Fernandes <joelaf@google.com> wrote:
>> On Thu, Apr 5, 2018 at 6:43 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
>>> On Wed, 4 Apr 2018 16:59:18 -0700
>>> Joel Fernandes <joelaf@google.com> wrote:
>>>
>>>> Happy to try anything else, BTW when the si_mem_available check
>>>> enabled, this doesn't happen and the buffer_size_kb write fails
>>>> normally without hurting anything else.
>>>
>>> Can you remove the RETRY_MAYFAIL and see if you can try again? It may
>>> be that we just remove that, and if si_mem_available() is wrong, it
>>> will kill the process :-/ My original code would only add MAYFAIL if it
>>> was a kernel thread (which is why I created the mflags variable).
>>
>> Tried this. Dropping RETRY_MAYFAIL and the si_mem_available check
>> destabilized the system and brought it down (along with OOM killing
>> the victim).
>>
>> System hung for several seconds and then both the memory hog and bash
>> got killed.
>
> I think its still Ok to keep the OOM patch as a safe guard even though
> its hard to test, and the si_mem_available on its own seem sufficient.
> What do you think?
>
> thanks,
>
>
> - Joel
I also test the patch on my system, which works fine for the previous script.

PS: The script I mentioned is the cts test case POC 16_12 on android8.1

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2018-04-06  7:16 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-04 15:53 [PATCH] ring-buffer: Add set/clear_current_oom_origin() during allocations Steven Rostedt
2018-04-04 16:00 ` Steven Rostedt
2018-04-04 16:03   ` Joel Fernandes
2018-04-04 16:13     ` Steven Rostedt
2018-04-04 16:18       ` Joel Fernandes
2018-04-04 23:59         ` Joel Fernandes
2018-04-05 13:43           ` Steven Rostedt
2018-04-05 19:57             ` Joel Fernandes
2018-04-05 23:36               ` Joel Fernandes
2018-04-06  7:16                 ` Zhaoyang Huang
2018-04-05 14:51           ` Michal Hocko
2018-04-05 19:47             ` Joel Fernandes

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).