From: Nikolay Borisov <kernel@kyup.com>
To: Michal Hocko <mhocko@suse.cz>
Cc: Linux MM <linux-mm@kvack.org>
Subject: Re: Softlockup during memory allocation
Date: Tue, 22 Nov 2016 16:46:10 +0200
Message-ID: <3b418cf3-1714-be2b-9108-8b04f6884e95@kyup.com>
In-Reply-To: <20161122143228.GC6831@dhcp22.suse.cz>



On 11/22/2016 04:32 PM, Michal Hocko wrote:
> On Tue 22-11-16 15:30:56, Michal Hocko wrote:
>> On Tue 22-11-16 10:56:51, Nikolay Borisov wrote:
>>>
>>>
>>> On 11/21/2016 07:31 AM, Michal Hocko wrote:
>>>> Hi,
>>>> I am sorry for a late response, but I was offline until this weekend. I
>>>> will try to get to this email ASAP but it might take some time.
>>>
>>> No worries. I did some further digging up and here is what I got, which
>>> I believe is rather strange:
>>>
>>> struct scan_control {
>>>   nr_to_reclaim = 32,
>>>   gfp_mask = 37880010,
>>>   order = 0,
>>>   nodemask = 0x0,
>>>   target_mem_cgroup = 0xffff8823990d1400,
>>>   priority = 7,
>>>   may_writepage = 1,
>>>   may_unmap = 1,
>>>   may_swap = 0,
>>>   may_thrash = 1,
>>>   hibernation_mode = 0,
>>>   compaction_ready = 0,
>>>   nr_scanned = 0,
>>>   nr_reclaimed = 0
>>> }
>>>
>>> Parsing: 37880010
>>> #define ___GFP_HIGHMEM		0x02
>>> #define ___GFP_MOVABLE		0x08
>>> #define ___GFP_IO		0x40
>>> #define ___GFP_FS		0x80
>>> #define ___GFP_HARDWALL		0x20000
>>> #define ___GFP_DIRECT_RECLAIM	0x400000
>>> #define ___GFP_KSWAPD_RECLAIM	0x2000000
>>>
>>> And initial_priority is 12 (DEF_PRIORITY). Given that nr_scanned is 0
>>> and priority is 7 this means we've gone 5 times through the do {} while
>>> in do_try_to_free_pages. Also total_scanned seems to be 0.  Here is the
>>> zone which was being reclaimed :
>>>
>>> http://sprunge.us/hQBi
>>
>> LRUs on that zones seem to be empty from a quick glance. kmem -z in the
>> crash can give you per zone counters much more nicely.
>>
>>> So what's strange is that the softlockup occurred but then the code
>>> proceeded (as evident from the subsequent stack traces), yet inspecting
>>> the reclaim progress it seems rather sad (no progress at all)
>>
>> Unless I have misread the data above it seems something has either
>> isolated all LRU pages for some time or there simply are none while the
>> reclaim is desperately trying to make some progress. In any case this
>> sounds less than a happy system...
> 
> Btw. how do you configure memcgs that the FS workload runs in?

The hierarchy is on cgroup v1 and looks like this:

/cgroup/LXC/cgroup-where-fs-load-runs

- LXC is given all of the machine's memory except for 5 GB.
- The leaf cgroup has a limit of 2 GB (524288 pages):

memory = {
  count = {
    counter = 523334
  },
  limit = 524288,
  parent = 0xffff881fefa40cb8,
  watermark = 524291,
  failcnt = 0
},
memsw = {
  count = {
    counter = 524310
  },
  limit = 524288,
  parent = 0xffff881fefa40ce0,
  watermark = 524320,
  failcnt = 294061026
},
kmem = {
  count = {
    counter = 0
  },
  limit = 2251799813685247,
  parent = 0xffff881fefa40d08,
  watermark = 0,
  failcnt = 0
},
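
The counters in this dump are in pages, so they can be sanity-checked
with a few lines of C. This is only a sketch: it assumes 4 KiB pages
(the dump does not state the page size, although 524288 pages only
matches the 2 GB limit at that size), and the helper names
pages_to_bytes and headroom_pages are made up for illustration, they
are not kernel functions.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages; the thread does not state the page size. */
#define ASSUMED_PAGE_SIZE 4096ULL

/* Convert a page counter value (in pages) to bytes. */
uint64_t pages_to_bytes(uint64_t pages)
{
	return pages * ASSUMED_PAGE_SIZE;
}

/*
 * Pages that can still be charged before the limit is reached.
 * Returns 0 when usage is already at or above the limit.
 */
uint64_t headroom_pages(uint64_t usage, uint64_t limit)
{
	return usage < limit ? limit - usage : 0;
}
```

With the values from the dump, pages_to_bytes(524288) is 2147483648
bytes (2 GiB), headroom_pages(523334, 524288) is 954 pages (roughly
3.7 MiB) for the memory counter, and headroom_pages(524310, 524288) is
0 because the memsw counter is already 22 pages over its limit.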


As you can see, the hierarchy is very shallow.
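
For completeness, the gfp_mask decoding quoted earlier in this message
can be verified mechanically. The sketch below uses only the ___GFP_*
values listed in the quoted text (they are specific to the kernel
being debugged and may differ in other versions). The flag table and
the gfp_decode helper are hypothetical names used for illustration.

```c
#include <stddef.h>
#include <string.h>

struct gfp_flag {
	unsigned int bit;
	const char *name;
};

/* Values copied from the quoted message, not from any other kernel version. */
static const struct gfp_flag gfp_flags[] = {
	{ 0x02,      "___GFP_HIGHMEM" },
	{ 0x08,      "___GFP_MOVABLE" },
	{ 0x40,      "___GFP_IO" },
	{ 0x80,      "___GFP_FS" },
	{ 0x20000,   "___GFP_HARDWALL" },
	{ 0x400000,  "___GFP_DIRECT_RECLAIM" },
	{ 0x2000000, "___GFP_KSWAPD_RECLAIM" },
};

/*
 * Write the names of the flags set in mask into buf, separated by '|'.
 * buf must have room for at least one byte (len >= 1); output is
 * truncated if it does not fit.
 * Returns the bits of mask that the table does not explain (0 = all decoded).
 */
unsigned int gfp_decode(unsigned int mask, char *buf, size_t len)
{
	size_t i;

	buf[0] = '\0';
	for (i = 0; i < sizeof(gfp_flags) / sizeof(gfp_flags[0]); i++) {
		if (!(mask & gfp_flags[i].bit))
			continue;
		if (buf[0] != '\0')
			strncat(buf, "|", len - strlen(buf) - 1);
		strncat(buf, gfp_flags[i].name, len - strlen(buf) - 1);
		mask &= ~gfp_flags[i].bit;
	}
	return mask;
}
```

For 37880010 (0x24200ca) the helper returns 0, meaning the seven flags
listed account for every bit set in the mask.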

