linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Fix the issue that lowmemkiller fell into a cycle that try to kill a task
@ 2014-09-23  2:57 Hui Zhu
  2014-09-23  4:17 ` Greg KH
  2014-09-24 15:34 ` Rik van Riel
  0 siblings, 2 replies; 8+ messages in thread
From: Hui Zhu @ 2014-09-23  2:57 UTC (permalink / raw)
  To: gregkh, rientjes, vinayakm.list, weijie.yang
  Cc: devel, linux-kernel, teawater, Hui Zhu

The cause of this issue is when free memroy size is low and a lot of task is
trying to shrink the memory, the task that is killed by lowmemkiller cannot get
CPU to exit itself.

Fix this issue with change the scheduling policy to SCHED_FIFO if a task's flag
is TIF_MEMDIE in lowmemkiller.

Signed-off-by: Hui Zhu <zhuhui@xiaomi.com>
---
 drivers/staging/android/lowmemorykiller.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/staging/android/lowmemorykiller.c b/drivers/staging/android/lowmemorykiller.c
index b545d3d..ca1ffac 100644
--- a/drivers/staging/android/lowmemorykiller.c
+++ b/drivers/staging/android/lowmemorykiller.c
@@ -129,6 +129,10 @@ static unsigned long lowmem_scan(struct shrinker *s, struct shrink_control *sc)
 
 		if (test_tsk_thread_flag(p, TIF_MEMDIE) &&
 		    time_before_eq(jiffies, lowmem_deathpending_timeout)) {
+			struct sched_param param = { .sched_priority = 1 };
+
+			if (p->policy == SCHED_NORMAL)
+				sched_setscheduler(p, SCHED_FIFO, &param);
 			task_unlock(p);
 			rcu_read_unlock();
 			return 0;
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] Fix the issue that lowmemkiller fell into a cycle that try to kill a task
  2014-09-23  2:57 [PATCH] Fix the issue that lowmemkiller fell into a cycle that try to kill a task Hui Zhu
@ 2014-09-23  4:17 ` Greg KH
  2014-09-23  4:48   ` 朱辉
  2014-09-24 15:34 ` Rik van Riel
  1 sibling, 1 reply; 8+ messages in thread
From: Greg KH @ 2014-09-23  4:17 UTC (permalink / raw)
  To: Hui Zhu
  Cc: rientjes, vinayakm.list, weijie.yang, devel, linux-kernel, teawater

On Tue, Sep 23, 2014 at 10:57:09AM +0800, Hui Zhu wrote:
> The cause of this issue is when free memroy size is low and a lot of task is
> trying to shrink the memory, the task that is killed by lowmemkiller cannot get
> CPU to exit itself.
> 
> Fix this issue with change the scheduling policy to SCHED_FIFO if a task's flag
> is TIF_MEMDIE in lowmemkiller.
> 
> Signed-off-by: Hui Zhu <zhuhui@xiaomi.com>
> ---
>  drivers/staging/android/lowmemorykiller.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/staging/android/lowmemorykiller.c b/drivers/staging/android/lowmemorykiller.c
> index b545d3d..ca1ffac 100644
> --- a/drivers/staging/android/lowmemorykiller.c
> +++ b/drivers/staging/android/lowmemorykiller.c
> @@ -129,6 +129,10 @@ static unsigned long lowmem_scan(struct shrinker *s, struct shrink_control *sc)
>  
>  		if (test_tsk_thread_flag(p, TIF_MEMDIE) &&
>  		    time_before_eq(jiffies, lowmem_deathpending_timeout)) {
> +			struct sched_param param = { .sched_priority = 1 };
> +
> +			if (p->policy == SCHED_NORMAL)
> +				sched_setscheduler(p, SCHED_FIFO, &param);

This seems really specific to a specific scheduler pattern now.  Isn't
there some other way to resolve this?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Fix the issue that lowmemkiller fell into a cycle that try to kill a task
  2014-09-23  4:17 ` Greg KH
@ 2014-09-23  4:48   ` 朱辉
  2014-09-23  8:00     ` Weijie Yang
  0 siblings, 1 reply; 8+ messages in thread
From: 朱辉 @ 2014-09-23  4:48 UTC (permalink / raw)
  To: Greg KH
  Cc: rientjes, vinayakm.list, weijie.yang, devel, linux-kernel, teawater

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="gb2312", Size: 1715 bytes --]



On 09/23/14 12:18, Greg KH wrote:
> On Tue, Sep 23, 2014 at 10:57:09AM +0800, Hui Zhu wrote:
>> The cause of this issue is when free memroy size is low and a lot of task is
>> trying to shrink the memory, the task that is killed by lowmemkiller cannot get
>> CPU to exit itself.
>>
>> Fix this issue with change the scheduling policy to SCHED_FIFO if a task's flag
>> is TIF_MEMDIE in lowmemkiller.
>>
>> Signed-off-by: Hui Zhu <zhuhui@xiaomi.com>
>> ---
>>   drivers/staging/android/lowmemorykiller.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/staging/android/lowmemorykiller.c b/drivers/staging/android/lowmemorykiller.c
>> index b545d3d..ca1ffac 100644
>> --- a/drivers/staging/android/lowmemorykiller.c
>> +++ b/drivers/staging/android/lowmemorykiller.c
>> @@ -129,6 +129,10 @@ static unsigned long lowmem_scan(struct shrinker *s, struct shrink_control *sc)
>>
>>   		if (test_tsk_thread_flag(p, TIF_MEMDIE) &&
>>   		    time_before_eq(jiffies, lowmem_deathpending_timeout)) {
>> +			struct sched_param param = { .sched_priority = 1 };
>> +
>> +			if (p->policy == SCHED_NORMAL)
>> +				sched_setscheduler(p, SCHED_FIFO, &param);
>
> This seems really specific to a specific scheduler pattern now.  Isn't
> there some other way to resolve this?

I tried to let the task that call lowmemkiller sleep some time when it 
try to kill same task.  But it doesn't work.
I think the issue is that the free memroy size is too low to make more 
and more tasks come to call lowmemkiller.

Thanks,
Hui

>
> thanks,
>
> greg k-h
>
ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Fix the issue that lowmemkiller fell into a cycle that try to kill a task
  2014-09-23  4:48   ` 朱辉
@ 2014-09-23  8:00     ` Weijie Yang
  2014-09-23 14:28       ` 朱辉
  0 siblings, 1 reply; 8+ messages in thread
From: Weijie Yang @ 2014-09-23  8:00 UTC (permalink / raw)
  To: 朱辉
  Cc: Greg KH, rientjes, vinayakm.list, weijie.yang, devel,
	linux-kernel, teawater

On Tue, Sep 23, 2014 at 12:48 PM, 朱辉 <zhuhui@xiaomi.com> wrote:
>
>
> On 09/23/14 12:18, Greg KH wrote:
>> On Tue, Sep 23, 2014 at 10:57:09AM +0800, Hui Zhu wrote:
>>> The cause of this issue is when free memroy size is low and a lot of task is
>>> trying to shrink the memory, the task that is killed by lowmemkiller cannot get
>>> CPU to exit itself.
>>>
>>> Fix this issue with change the scheduling policy to SCHED_FIFO if a task's flag
>>> is TIF_MEMDIE in lowmemkiller.
>>>
>>> Signed-off-by: Hui Zhu <zhuhui@xiaomi.com>
>>> ---
>>>   drivers/staging/android/lowmemorykiller.c | 4 ++++
>>>   1 file changed, 4 insertions(+)
>>>
>>> diff --git a/drivers/staging/android/lowmemorykiller.c b/drivers/staging/android/lowmemorykiller.c
>>> index b545d3d..ca1ffac 100644
>>> --- a/drivers/staging/android/lowmemorykiller.c
>>> +++ b/drivers/staging/android/lowmemorykiller.c
>>> @@ -129,6 +129,10 @@ static unsigned long lowmem_scan(struct shrinker *s, struct shrink_control *sc)
>>>
>>>              if (test_tsk_thread_flag(p, TIF_MEMDIE) &&
>>>                  time_before_eq(jiffies, lowmem_deathpending_timeout)) {
>>> +                    struct sched_param param = { .sched_priority = 1 };
>>> +
>>> +                    if (p->policy == SCHED_NORMAL)
>>> +                            sched_setscheduler(p, SCHED_FIFO, &param);
>>
>> This seems really specific to a specific scheduler pattern now.  Isn't
>> there some other way to resolve this?

hui, how about modify lowmem_deathpending_timeout if we don't
touch scheduler pattern?

> I tried to let the task that call lowmemkiller sleep some time when it
> try to kill same task.  But it doesn't work.
> I think the issue is that the free memroy size is too low to make more
> and more tasks come to call lowmemkiller.

I am not opposed to the idea that the task which is selected to be killed
should exit ASAP.

I want to make it clear, what is problem for the existing code and which
effect we can get by applying this patch.
1. LMK count is increased, which can be reduced by applying this patch?
2. app become more sluggish?

By the way, whether we need to modify out_of_memory() which also
try to kill task?

> Thanks,
> Hui
>
>>
>> thanks,
>>
>> greg k-h
>>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Fix the issue that lowmemkiller fell into a cycle that try to kill a task
  2014-09-23  8:00     ` Weijie Yang
@ 2014-09-23 14:28       ` 朱辉
  2014-09-23 22:24         ` David Rientjes
  0 siblings, 1 reply; 8+ messages in thread
From: 朱辉 @ 2014-09-23 14:28 UTC (permalink / raw)
  To: Weijie Yang
  Cc: Greg KH, rientjes, vinayakm.list, weijie.yang, devel,
	linux-kernel, teawater

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="gb2312", Size: 3000 bytes --]

On 09/23/14 16:00, Weijie Yang wrote:
> On Tue, Sep 23, 2014 at 12:48 PM, Öì»Ô <zhuhui@xiaomi.com> wrote:
>>
>>
>> On 09/23/14 12:18, Greg KH wrote:
>>> On Tue, Sep 23, 2014 at 10:57:09AM +0800, Hui Zhu wrote:
>>>> The cause of this issue is when free memroy size is low and a lot of task is
>>>> trying to shrink the memory, the task that is killed by lowmemkiller cannot get
>>>> CPU to exit itself.
>>>>
>>>> Fix this issue with change the scheduling policy to SCHED_FIFO if a task's flag
>>>> is TIF_MEMDIE in lowmemkiller.
>>>>
>>>> Signed-off-by: Hui Zhu <zhuhui@xiaomi.com>
>>>> ---
>>>>    drivers/staging/android/lowmemorykiller.c | 4 ++++
>>>>    1 file changed, 4 insertions(+)
>>>>
>>>> diff --git a/drivers/staging/android/lowmemorykiller.c b/drivers/staging/android/lowmemorykiller.c
>>>> index b545d3d..ca1ffac 100644
>>>> --- a/drivers/staging/android/lowmemorykiller.c
>>>> +++ b/drivers/staging/android/lowmemorykiller.c
>>>> @@ -129,6 +129,10 @@ static unsigned long lowmem_scan(struct shrinker *s, struct shrink_control *sc)
>>>>
>>>>               if (test_tsk_thread_flag(p, TIF_MEMDIE) &&
>>>>                   time_before_eq(jiffies, lowmem_deathpending_timeout)) {
>>>> +                    struct sched_param param = { .sched_priority = 1 };
>>>> +
>>>> +                    if (p->policy == SCHED_NORMAL)
>>>> +                            sched_setscheduler(p, SCHED_FIFO, &param);
>>>
>>> This seems really specific to a specific scheduler pattern now.  Isn't
>>> there some other way to resolve this?
>
> hui, how about modify lowmem_deathpending_timeout if we don't
> touch scheduler pattern?

I tried to change line "lowmem_deathpending_timeout = jiffies + HZ" to 
"lowmem_deathpending_timeout = jiffies + HZ * 10".
But the issue can be reproduced sometimes.
Could you give me some comments on this part?

>
>> I tried to let the task that call lowmemkiller sleep some time when it
>> try to kill same task.  But it doesn't work.
>> I think the issue is that the free memroy size is too low to make more
>> and more tasks come to call lowmemkiller.
>
> I am not opposed to the idea that the task which is selected to be killed
> should exit ASAP.
>
> I want to make it clear, what is problem for the existing code and which
> effect we can get by applying this patch.
> 1. LMK count is increased, which can be reduced by applying this patch?
I think the free mem size will grow faster than without this patch.
> 2. app become more sluggish?
I didn't get that in my part.

>
> By the way, whether we need to modify out_of_memory() which also
> try to kill task?

I am not sure because LMK handle the memory issue early than OOM.
But I think this issue will not affect OOM because OOM has oom_zonelist_trylock and oom_zonelist_unlock.

Thanks,
Hui

>
>> Thanks,
>> Hui
>>
>>>
>>> thanks,
>>>
>>> greg k-h
>>>
>ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Fix the issue that lowmemkiller fell into a cycle that try to kill a task
  2014-09-23 14:28       ` 朱辉
@ 2014-09-23 22:24         ` David Rientjes
  0 siblings, 0 replies; 8+ messages in thread
From: David Rientjes @ 2014-09-23 22:24 UTC (permalink / raw)
  To: 朱辉
  Cc: Weijie Yang, Greg KH, vinayakm.list, weijie.yang, devel,
	linux-kernel, teawater

[-- Attachment #1: Type: TEXT/PLAIN, Size: 554 bytes --]

On Tue, 23 Sep 2014, 朱辉 wrote:

> > By the way, whether we need to modify out_of_memory() which also
> > try to kill task?
> 
> I am not sure because LMK handle the memory issue early than OOM.
> But I think this issue will not affect OOM because OOM has oom_zonelist_trylock and oom_zonelist_unlock.
> 

The oom killer tries to schedule other processes that exit the oom killer 
without being a victim itself and any process that cannot get the zonelist 
lock will do the same in the page allocator.  The low memory killer case 
is somewhat different.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Fix the issue that lowmemkiller fell into a cycle that try to kill a task
  2014-09-23  2:57 [PATCH] Fix the issue that lowmemkiller fell into a cycle that try to kill a task Hui Zhu
  2014-09-23  4:17 ` Greg KH
@ 2014-09-24 15:34 ` Rik van Riel
  2014-10-14  6:51   ` 朱辉
  1 sibling, 1 reply; 8+ messages in thread
From: Rik van Riel @ 2014-09-24 15:34 UTC (permalink / raw)
  To: Hui Zhu, gregkh, rientjes, vinayakm.list, weijie.yang
  Cc: devel, linux-kernel, teawater

On 09/22/2014 10:57 PM, Hui Zhu wrote:
> The cause of this issue is when free memroy size is low and a lot of task is
> trying to shrink the memory, the task that is killed by lowmemkiller cannot get
> CPU to exit itself.
> 
> Fix this issue with change the scheduling policy to SCHED_FIFO if a task's flag
> is TIF_MEMDIE in lowmemkiller.

Is it actually true that the task that was killed by lowmemkiller
cannot get CPU time?

It is also possible that the task is busy in the kernel, for example
in the reclaim code, and is not breaking out of some loop fast enough,
despite the TIF_MEMDIE flag being set.

I suspect SCHED_FIFO simply papers over that kind of issue, by not
letting anything else run until the task is gone, instead of fixing
the root cause of the problem.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Fix the issue that lowmemkiller fell into a cycle that try to kill a task
  2014-09-24 15:34 ` Rik van Riel
@ 2014-10-14  6:51   ` 朱辉
  0 siblings, 0 replies; 8+ messages in thread
From: 朱辉 @ 2014-10-14  6:51 UTC (permalink / raw)
  To: Rik van Riel, 朱辉,
	gregkh, rientjes, vinayakm.list, weijie.yang
  Cc: devel, linux-kernel, teawater

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="gb2312", Size: 2057 bytes --]

2014 09 24 23:36, Rik van Riel:
> On 09/22/2014 10:57 PM, Hui Zhu wrote:
>> The cause of this issue is when free memroy size is low and a lot of task is
>> trying to shrink the memory, the task that is killed by lowmemkiller cannot get
>> CPU to exit itself.
>>
>> Fix this issue with change the scheduling policy to SCHED_FIFO if a task's flag
>> is TIF_MEMDIE in lowmemkiller.
>
> Is it actually true that the task that was killed by lowmemkiller
> cannot get CPU time?

I am so sorry that answer this mail late because I tried to do more test 
around it.
But this issue is really hard to reproduce the issue.  I got a special 
app that can reproduce this issue easyly. But I still need retry a lot 
of times to repdroduce this issue.

And I found that most of time, the task cannot be killed because it is 
blocked by binder_lock.
It looks like there are something wrong with a task that get binder_lock 
and it is blocked by another thing.

So I make a patch that change a binder_lock to binder_lock_killable to 
handle this issue.(I will post it later)
It work sometime but I am not sure it is right.
And I just met one time, the kernel with the binder patch and without 
the lowmemkiller SCHED_FIFO patch, a task that didn't blocked by a lock. 
  And different tasks call lowmemkiller tried to kill this task.
I think the root cause of this issue is killed task cannot get cpu.
But I just got this issue one time.

>
> It is also possible that the task is busy in the kernel, for example
> in the reclaim code, and is not breaking out of some loop fast enough,
> despite the TIF_MEMDIE flag being set.
>
> I suspect SCHED_FIFO simply papers over that kind of issue, by not
> letting anything else run until the task is gone, instead of fixing
> the root cause of the problem.
>
>

According to I introduction, I think lowmemkiller SCHED_FIFO patch maybe 
can handle some issue.

Thanks,
Hui
ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2014-10-14  7:01 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-23  2:57 [PATCH] Fix the issue that lowmemkiller fell into a cycle that try to kill a task Hui Zhu
2014-09-23  4:17 ` Greg KH
2014-09-23  4:48   ` 朱辉
2014-09-23  8:00     ` Weijie Yang
2014-09-23 14:28       ` 朱辉
2014-09-23 22:24         ` David Rientjes
2014-09-24 15:34 ` Rik van Riel
2014-10-14  6:51   ` 朱辉

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).