From: Ying Han <yinghan@google.com>
To: Minchan Kim <minchan@kernel.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Michal Hocko <mhocko@suse.cz>,
	Johannes Weiner <hannes@cmpxchg.org>, Mel Gorman <mel@csn.ul.ie>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Rik van Riel <riel@redhat.com>,
	Minchan Kim <minchan.kim@gmail.com>,
	Hugh Dickins <hughd@google.com>, Nick Piggin <npiggin@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org
Subject: Re: [RFC PATCH] do_try_to_free_pages() might enter infinite loop
Date: Mon, 23 Apr 2012 19:06:22 -0700
Message-ID: <CALWz4iy0af11bSrgZdeCrXFt7ZauhpPAQkLj9D0v0hrANRXGug@mail.gmail.com>
In-Reply-To: <4F960257.9090509@kernel.org>

On Mon, Apr 23, 2012 at 6:31 PM, Minchan Kim <minchan@kernel.org> wrote:
> Hi Ying,
>
> On 04/24/2012 08:18 AM, Ying Han wrote:
>
>> On Mon, Apr 23, 2012 at 3:20 PM, KOSAKI Motohiro
>> <kosaki.motohiro@jp.fujitsu.com> wrote:
>>> On Mon, Apr 23, 2012 at 4:56 PM, Ying Han <yinghan@google.com> wrote:
>>>> This patch is not meant to be merged at all; it is an attempt to understand
>>>> a piece of logic in global direct reclaim.
>>>>
>>>> In global direct reclaim, when reclaim fails at priority 0 and
>>>> zone->all_unreclaimable is not set, direct reclaim starts over from
>>>> DEF_PRIORITY. In some extreme cases, we've seen the system hang, which is
>>>> very likely caused by direct reclaim entering an infinite loop.
>>>>
>>>> There has been a series of patches trying to fix similar issues, and the
>>>> latest patch has a good summary of all the efforts:
>>>>
>>>> commit 929bea7c714220fc76ce3f75bef9056477c28e74
>>>> Author: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
>>>> Date:   Thu Apr 14 15:22:12 2011 -0700
>>>>
>>>>    vmscan: all_unreclaimable() use zone->all_unreclaimable as a name
>>>>
>>>> Kosaki explained the problem triggered by zone->all_unreclaimable and
>>>> zone->pages_scanned being out of sync, where the latter was being checked by
>>>> direct reclaim. However, after the patch, the problem remains that the setting
>>>> of zone->all_unreclaimable is asynchronous with whether the zone is actually
>>>> reclaimable or not.
>>>>
>>>> The zone->all_unreclaimable flag is set by kswapd by checking zone->pages_scanned in
>>>> zone_reclaimable(). Is it possible to have zone->all_unreclaimable == false while
>>>> the zone is actually unreclaimable?
>>>>
>>>> 1. While kswapd is in the reclaim priority loop, someone frees a page in the
>>>> zone. That ends up resetting pages_scanned.
>>>>
>>>> 2. kswapd is frozen for whatever reason. I noticed Kosaki has covered the
>>>> hibernation case by checking oom_killer_disabled, but I am not sure that is
>>>> everything we need to worry about. The key point here is that direct reclaim
>>>> relies on a flag which is set by kswapd asynchronously, and that doesn't sound safe.
>>>
>>> If kswapd can be frozen in cases other than hibernation, why don't you
>>> add a frozen check instead of the hibernation check? And when and why
>>> does that happen?
>>
>> I haven't tried to reproduce the issue, so everything is based on
>> eyeballing the code. The problem is that we have a potential
>> infinite loop in direct reclaim, where it keeps trying as long as
>> !zone->all_unreclaimable.
>>
>> The flag is only set by kswapd and it will skip setting the flag if
>> the following condition is true:
>>
>> zone->pages_scanned < zone_reclaimable_pages(zone) * 6;
>>
>> In a few-pages-on-lru condition, zone->pages_scanned easily
>> remains 0, and it is also reset to 0 every time a page is freed.
>> Then it will cause global direct reclaim to enter an infinite loop.
>>
>
>
> How does zone->pages_scanned easily become 0 in global reclaim?
> Once the VM has pages on the LRU, it wouldn't be zero. Look at isolate_lru_pages.
> The problem would be get_scan_count, which could prevent scanning of the LRU list,
> but it works well now. If the priority isn't zero and there are few pages on the
> LRU, the scan count could be zero, but when the priority drops to zero, the VM is
> allowed to scan fewer than SWAP_CLUSTER_MAX pages. So pages_scanned would be increased.

Yes, that is true. But pages_scanned is reset whenever a page is
freed, and that can happen asynchronously. For example, say I have only
2 pages on file_lru (w/o swap); here is what is supposed to happen:

   A                                            kswapd                          B

direct reclaim
                                                priority DEF_PRIORITY to 0
                                                zone->pages_scanned = 3
                                                zone_reclaimable() == true
                                                zone->all_unreclaimable == 0
nr_reclaimed == 0 & !zone->all_unreclaimable
retry
                                                priority DEF_PRIORITY to 0
                                                zone->pages_scanned = 6
                                                zone_reclaimable() == true
                                                zone->all_unreclaimable == 0
nr_reclaimed == 0 & !zone->all_unreclaimable
retry
                                                repeat the above; eventually
                                                zone->pages_scanned grows to 12
                                                zone_reclaimable() == false
                                                zone->all_unreclaimable == 1
nr_reclaimed == 0 & zone->all_unreclaimable
oom

However, what if B frees a page every time before pages_scanned
reaches that threshold? Then we won't set zone->all_unreclaimable at all.
If so, we reach a livelock here...

>
> I think the problem is live-lock as follows,
>
>
>    A                   kswapd                          B
>
> direct reclaim
> reclaim a page
>                        pages_scanned check <- skip
>
>                                                        steal a page reclaimed by A
>                                                        use the page for user memory.
> alloc failed
> retry
>
> In this scenario, process A would be live-locked.
> Does that make sense for the infinite loop case you mentioned?

Maybe, but that needs to be verified. The problem is that we cannot
distinguish this case from the one I listed above just by seeing
do_try_to_free_pages() always return 1. AFAIK, we do see
zone->pages_scanned == 0 in some of the cases after instrumenting the
kernel.

Overall, having direct reclaim sit in an infinite loop based on the
zone->all_unreclaimable flag looks scary.

--Ying

>
>
> --
> Kind regards,
> Minchan Kim
