From: Nikolay Borisov <kernel@kyup.com>
To: Michal Hocko <mhocko@suse.cz>
Cc: Linux MM <linux-mm@kvack.org>
Subject: Re: Softlockup during memory allocation
Date: Thu, 24 Nov 2016 15:09:38 +0200
Message-ID: <a655e607-91c5-173c-ec3a-e211df598f92@kyup.com>
In-Reply-To: <20161124121209.GE20668@dhcp22.suse.cz>



On 11/24/2016 02:12 PM, Michal Hocko wrote:
> On Thu 24-11-16 13:45:03, Nikolay Borisov wrote:
> [...]
>> Ok, I think I know what has happened. Inspecting the data structures of
>> the respective cgroup, here is what the mem_cgroup_per_zone looks like:
>>
>>   zoneinfo[2] =   {
>>     lruvec = {{
>>         lists = {
>>           {
>>             next = 0xffffea004f98c660,
>>             prev = 0xffffea0063f6b1a0
>>           },
>>           {
>>             next = 0xffffea0004123120,
>>             prev = 0xffffea002c2e2260
>>           },
>>           {
>>             next = 0xffff8818c37bb360,
>>             prev = 0xffff8818c37bb360
>>           },
>>           {
>>             next = 0xffff8818c37bb370,
>>             prev = 0xffff8818c37bb370
>>           },
>>           {
>>             next = 0xffff8818c37bb380,
>>             prev = 0xffff8818c37bb380
>>           }
>>         },
>>         reclaim_stat = {
>>           recent_rotated = {172969085, 43319509},
>>           recent_scanned = {173112994, 185446658}
>>         },
>>         zone = 0xffff88207fffcf00
>>     }},
>>     lru_size = {159722, 158714, 0, 0, 0},
>>     }
>>
>> So this means that there are inactive_anon and active_anon pages only -
>> correct?
> 
> yes. at least in this particular zone.
> 
>> Since the machine doesn't have any swap this means anon memory
>> has nowhere to go. If I'm interpreting the data correctly then this
>> explains why reclaim makes no progress. If that's the case then I have
>> the following questions:
>>
>> 1. Shouldn't reclaim exit at some point rather than being stuck
>> without making further progress?
> 
> Reclaim (try_to_free_mem_cgroup_pages) has to go down through all
> priorities before it gets out. We are not doing any proactive checks
> whether there is anything reclaimable, but that alone shouldn't be such
> a big deal because shrink_node_memcg should simply do nothing:
> get_scan_count will find no pages to scan. So it shouldn't take much
> time to realize there is nothing to reclaim and get back to try_charge,
> which retries a few more times and eventually goes OOM. I do not see how
> we could trigger RCU stalls here. There shouldn't be any long RCU
> critical section on the way, and there are preemption points on the way.
> 
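
To check that I follow the priority walk described here, this is a rough
userspace model of it, not the kernel code. DEF_PRIORITY is 12 in the
kernel and the per-pass scan target is the LRU size shifted right by the
priority; scan_target() and reclaim_passes() are hypothetical names of
mine, and the model assumes every scanned page is freed.

```c
#define DEF_PRIORITY 12 /* same value as the kernel's DEF_PRIORITY */

/* Pages to scan from one LRU list at a given reclaim priority. */
static unsigned long scan_target(unsigned long lru_size, int priority)
{
	return lru_size >> priority;
}

/*
 * Hypothetical model: walk priorities from DEF_PRIORITY down to 0 and
 * return how many passes were made. Assumes (optimistically) that every
 * scanned page is reclaimed.
 */
static int reclaim_passes(unsigned long reclaimable,
			  unsigned long nr_to_reclaim)
{
	unsigned long reclaimed = 0;
	int passes = 0;
	int priority;

	for (priority = DEF_PRIORITY; priority >= 0; priority--) {
		passes++;
		reclaimed += scan_target(reclaimable, priority);
		if (reclaimed >= nr_to_reclaim)
			break;	/* target met, stop early */
	}
	return passes;
}
```

With nothing reclaimable the model makes all 13 passes and frees nothing,
but each pass is trivial, so in this simplified picture the walk alone
should not account for a stall.
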
>> 2. It seems rather strange that there are no (INACTIVE|ACTIVE)_FILE
>> pages - is this possible?
> 
> All of them might be reclaimed already as a result of the memory
> pressure in the memcg. So not all that surprising. But the fact that
> you are hitting the limit means that the anonymous pages saturate your
> hard limit so your memcg seems underprovisioned.
> 
>> 3. Why hasn't OOM been activated in order to free up some anonymous memory?
> 
> It should eventually. Maybe there still were some reclaimable pages in
> other zones for this memcg.

I just checked all the zones for both nodes (the machines have 2 NUMA
nodes), and essentially there are no reclaimable pages - all are
anonymous. So the pertinent question is why processes are sleeping in
the reclaim path when there are no pages to free. I also observed the
same behavior on a different node; this time the priority was 0 and the
code still hadn't resorted to OOM. This all seems too strange.
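
For reference, this is how I am reading the lru_size array from the dump.
It is only a sketch: the enum mirrors the order of the kernel's
enum lru_list, but reclaimable_pages() is a hypothetical helper of mine,
and treating anon pages as unreclaimable without swap is a simplification
of what get_scan_count decides.

```c
#include <assert.h>
#include <stdbool.h>

/* Same ordering as the kernel's enum lru_list. */
enum lru_list {
	LRU_INACTIVE_ANON,
	LRU_ACTIVE_ANON,
	LRU_INACTIVE_FILE,
	LRU_ACTIVE_FILE,
	LRU_UNEVICTABLE,
	NR_LRU_LISTS
};

/*
 * Hypothetical helper, not a kernel function: pages reclaim could free.
 * File pages always count; anon pages count only when swap is available.
 */
static unsigned long reclaimable_pages(const unsigned long *lru_size,
				       bool have_swap)
{
	unsigned long nr = lru_size[LRU_INACTIVE_FILE] +
			   lru_size[LRU_ACTIVE_FILE];

	if (have_swap)
		nr += lru_size[LRU_INACTIVE_ANON] +
		      lru_size[LRU_ACTIVE_ANON];
	return nr;
}
```

Fed with lru_size = {159722, 158714, 0, 0, 0} and no swap, this gives 0
reclaimable pages for the zone, which matches what I see in every zone of
both nodes.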

